Abstract

We introduce particle systems in one or more dimensions in which particles perform branching Brownian motion and the population size is kept constant equal to $N > 1$, through the following selection mechanism: at all times only the $N$ fittest particles survive, while all the other particles are removed. Fitness is measured with respect to some given score function $s : \mathbb{R}^d \to \mathbb{R}$. For some choices of the function $s$, it is proved that the cloud of particles travels at positive speed in some possibly random direction. In the case where $s$ is linear, we show under some assumptions on the initial configuration that the shape of the cloud scales like $\log N$ in the direction parallel to motion but at least $c(\log N)^{3/2}$ in the orthogonal direction for some $c > 0$. We conjecture that the exponent $3/2$ is sharp. This result is equivalent to the following result of independent interest: in one-dimensional systems, the genealogical time is greater than $c(\log N)^3$, thereby contributing a step towards the original predictions of Brunet and Derrida. We discuss several open problems and also explain how our results can be viewed as a rigorous justification of Weismann's arguments for the role of recombination in population genetics.

1 Introduction

1.1 Main results

Let $d \ge 1$ and let $s : \mathbb{R}^d \to \mathbb{R}$ denote a fixed function, which we will refer to as the score or fitness function in what follows. We consider the following system of $N$ particles in $\mathbb{R}^d$, $(X_1(t), \ldots, X_N(t))$, defined informally by the following two rules:

• Each particle $X_i$ follows the trajectory of an independent Brownian motion.

• In addition each particle undergoes binary branching at rate 1. After each branching event, we remove from the population the particle $i$ with minimal score, i.e., the one attaining $\min_i s(X_i(t))$.

Note in particular that the population size stays constant (equal to $N$) throughout time. Unless otherwise specified, we will always order particles $X_1(t), \ldots, X_N(t)$ by decreasing fitness, i.e., so that
\[
s(X_1(t)) \ge \ldots \ge s(X_N(t)) \tag{1}
\]
with arbitrary choice in case of a tie.
This process can be seen as a multi-dimensional generalisation of the model of branching Brownian motion with selection in $\mathbb{R}$ introduced by Brunet, Derrida, Mueller and Munier [10, 11]. This is the model which arises as a particular case of the above description with $d = 1$ and $s(x) = x$.

The motivation for this process was the study of the effect of natural selection on the genealogy of a population. Using nonrigorous methods, Brunet et al. made several striking predictions, which we summarise below. Order the particles from right to left (so $X_1(t) \ge \ldots \ge X_N(t)$). Then:

(i) For fixed $N$, $\lim_{t \to \infty} X_1(t)/t = \lim_{t \to \infty} X_N(t)/t \overset{\text{def}}{=} v_N$ almost surely, where $v_N$ is a deterministic constant.

(ii) As $N \to \infty$, $v_N = v_\infty - c/(\log N)^2 + o((\log N)^{-2})$, where $v_\infty$ is the speed of the rightmost particle in a free branching Brownian motion (or free branching random walk if time is discrete), and $c$ is an explicit constant.

(iii) Finally, the genealogical time scale for this population is $(\log N)^3$. More precisely, the genealogy of an arbitrary sample of the population, rescaled by $(\log N)^3$, converges to the Bolthausen-Sznitman coalescent (see for instance [3] for definitions and more discussion about this problem).



The arguments of Brunet et al. [10], [11] relied on a nonrigorous analogy with the noisy Fisher-Kolmogorov-Petrovskii-Piskounov (FKPP) equation
\[
\frac{\partial u}{\partial t} = \frac{1}{2} \frac{\partial^2 u}{\partial x^2} + \ldots \tag{2}
\]
and relied strongly on ideas developed earlier by Brunet and Derrida [7], [8], [9] on the effect of noise on such an equation. For this reason this process is sometimes known as the Brunet-Derrida particle system. From a rigorous point of view, proofs of (i) and (ii) can be found in the paper of Bérard and Gouéré [2], while a rigorous proof of (iii) can be found in [2] for a closely related model. However (iii) remains open for the original Brunet-Derrida process, though exciting progress in this direction has been achieved recently by Maillard [18].

The main goal of this paper is to study geometric properties of the $d$-dimensional systems and to partly resolve prediction (iii) above in the case $d = 1$. We start with our results in $d$ dimensions. Our results are valid in two particular cases:

(Case A) Euclidean case: $s(x_1, \ldots, x_d) = \sqrt{x_1^2 + \ldots + x_d^2}$.

(Case B) Linear case: for some vector $\lambda \in \mathbb{R}^d$, $s(x) = \langle \lambda, x \rangle$.
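To make the two score functions and the selection rule concrete, here is a minimal sketch in Python. The function names (`score_euclidean`, `make_score_linear`, `branch_and_select`) are illustrative and not taken from the paper, and the tie-breaking rule (first minimal index) is one arbitrary choice among many.

```python
import math

def score_euclidean(x):
    """Case A: s(x) = ||x||, the Euclidean norm of the position."""
    return math.sqrt(sum(c * c for c in x))

def make_score_linear(lam):
    """Case B: s(x) = <lam, x> for a fixed vector lam."""
    def score(x):
        return sum(l * c for l, c in zip(lam, x))
    return score

def branch_and_select(particles, k, score):
    """Duplicate particle k, then remove one particle of minimal score.

    The population size is unchanged. Ties are broken arbitrarily
    (here: the first minimal index is removed).
    """
    grown = list(particles) + [particles[k]]
    worst = min(range(len(grown)), key=lambda i: score(grown[i]))
    del grown[worst]
    # Re-sort by decreasing score, as in the ordering convention (1).
    return sorted(grown, key=score, reverse=True)

cloud = [(3.0, 0.0), (0.0, 1.0), (-2.0, 0.0)]
after_a = branch_and_select(cloud, 0, score_euclidean)
after_b = branch_and_select(cloud, 0, make_score_linear((1.0, 0.0)))
```

With the Euclidean score the particle at $(0, 1)$ is the least fit and is removed, whereas with the linear score in direction $(1, 0)$ it is the particle at $(-2, 0)$ that disappears: the same cloud is pruned differently depending on $s$.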

See Figure 1 for two realisations of the process in the Euclidean case (case A). The linear case (case B) in the two-dimensional case ($d = 2$) is particularly relevant from the point of view of applications, since it is reasonable to assume that for diploid populations, the total fitness of any given individual is a linear combination of the fitnesses of each of her alleles. (In this interpretation we thus view each coordinate as the fitness of the allele on the corresponding chromosome, and so the 'spatial' position has nothing to do with the geographical position of that individual in space. See below for further discussion about the biological relevance of our results.)

Figure 1: Two realisations of the particle system with $N = 1000$, $d = 2$, the Euclidean score function $s(x, y)$ of case A, and jump distribution uniform in the unit disk. The particles are plotted after 20, 60, 100, 150 and 200 generations with decreasing brightness.



Simulations suggest that after an initial phase where the particles live in fragmented clusters on a circle of a given radius (which increases at linear speed), particles eventually aggregate in one clump, which travels at that speed in a random direction. A similar phenomenon is observed in simulations for the linear case (case B). Our first result makes this observation rigorous. In order to state it, it is convenient to introduce some notation. If $t > 0$ and $1 \le n \le N$, write $X_n(t) = R_n(t)\Theta_n(t)$, where $R_n(t) > 0$ and $\Theta_n(t) \in \mathbb{S}^{d-1}$ is continuous. Note that for $d \ge 2$, almost surely $X_n(t) \ne 0$ for all $t > 0$ and $1 \le n \le N$.



Theorem 1.1. Let $N > 1$ and consider a Brunet-Derrida process in $\mathbb{R}^d$ with $N$ particles, driven by the Euclidean score function $s(x) = \|x\|$ (case A). Then,
\[
\max_{1 \le n, m \le N} \frac{\|X_n(t) - X_m(t)\|}{t} \to 0, \tag{3}
\]
as $t \to \infty$ almost surely. Moreover,
\[
\frac{R_1(t)}{t} \to v_N, \qquad \Theta_1(t) \to \Theta, \tag{4}
\]
where $v_N > 0$ is a deterministic constant and $\Theta$ is distributed on $\mathbb{S}^{d-1}$. Both these convergences hold almost surely.

Remark 1.2. In the above theorem, (3) says that the particles eventually aggregate in one clump. On the other hand (4) says that the clump travels at linear speed $v_N$, in a randomly chosen direction $\Theta$. We will discuss below more precisely the diameter of the cloud of particles, which (for a fixed $N$, as $t \to \infty$) stays of order one.






Remark 1.3. This theorem is actually true for a more general class of Brunet-Derrida systems than the ones discussed in this introduction and, indeed, in much of the paper. See Remark 2.13 for a discussion of the class of processes to which our proofs apply.



We are also able to obtain a lower bound for the correct genealogical time for the one- 
dimensional process up to some mild conditions on the initial position of the particles. 
A similar result holds in the linear case: 

Theorem 1.4. Let $N > 1$ and consider a Brunet-Derrida process in $\mathbb{R}^d$ with $N$ particles, driven by the linear score function $s(x) = \langle \lambda, x \rangle$ for some $\lambda \in \mathbb{S}^{d-1}$ (case B). Then,
\[
\max_{1 \le n, m \le N} \frac{\|X_n(t) - X_m(t)\|}{t} \to 0, \tag{5}
\]
as $t \to \infty$ almost surely. Moreover,
\[
\frac{X_1(t)}{t} \to \lambda v_N, \tag{6}
\]
almost surely, where $v_N > 0$ is a deterministic constant.

Remark 1.5. In particular, in this case, the direction of the cloud of particles is deterministic and is simply $\lambda$.

Remark 1.6. It is not hard to see that the $v_N$ appearing in Theorems 1.1 and 1.4 are both equal to the asymptotic speed of a one-dimensional (standard) Brunet-Derrida system. Hence, adapting a result of Bérard and Gouéré [2] for branching Brownian motion, we get
\[
v_N = \sqrt{2} - \frac{\pi^2}{\sqrt{2}\,(\log N)^2} + o\big((\log N)^{-2}\big), \tag{7}
\]
as $N \to \infty$.
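To get a feeling for how slowly the speed approaches its limit, the following sketch evaluates the first two terms of the expansion of $v_N$. It assumes the correction constant $\pi^2/\sqrt{2}$, which is the value given by the Brunet-Derrida formula for branching rate 1 and standard Brownian motion; the function name `speed_approx` is illustrative, and the $o((\log N)^{-2})$ remainder is simply dropped, so the numbers are only meaningful for large $N$.

```python
import math

V_INF = math.sqrt(2.0)  # speed of the rightmost particle of free branching Brownian motion

def speed_approx(n):
    """Two-term approximation sqrt(2) - pi^2 / (sqrt(2) (log n)^2) of v_N."""
    return V_INF - math.pi ** 2 / (math.sqrt(2.0) * math.log(n) ** 2)

for exponent in (3, 6, 12, 24, 100):
    n = 10.0 ** exponent
    print(f"N = 1e{exponent:<3d}  approximate v_N = {speed_approx(n):.4f}")
```

The logarithmic correction means that even astronomically large populations travel noticeably slower than $\sqrt{2}$, which is one reason why the regime is hard to probe by direct simulation.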

Our next results concern the dimensions of the cloud of particles. The simulations above suggest, somewhat counterintuitively, that the cloud of particles is more elongated in the direction orthogonal to the fitness gradient (and the limiting direction of the cloud). This is corroborated by a close-up view of the cloud of particles (see Figure 2).

We are able to establish this phenomenon under some reasonable assumptions on the initial condition, in case B. Fix $\lambda \in \mathbb{S}^{d-1}$ and let $\lambda^\perp$ be an arbitrary unit vector such that $\langle \lambda, \lambda^\perp \rangle = 0$. Define
\[
\mathrm{diam}^{\parallel}_t = \max_{1 \le m, n \le N} \big|\langle X_n(t) - X_m(t), \lambda \rangle\big|
\]
and
\[
\mathrm{diam}^{\perp}_t = \max_{1 \le m, n \le N} \big|\langle X_n(t) - X_m(t), \lambda^\perp \rangle\big|.
\]
Fix $\lambda \in \mathbb{S}^{d-1}$ and for all $x \in \mathbb{R}^d$ let $\bar{x} = \langle \lambda, x \rangle$.

We introduce an assumption on the initial condition which will be used in several results below. Let $X_1(t), \ldots, X_N(t)$ denote the particles of a Brunet-Derrida system driven by the linear score function $s(x) = \bar{x}$. Let $\bar{X}_n(t) = \langle X_n(t), \lambda \rangle$, and label the particles by decreasing fitness $\bar{X}_1(t) \ge \ldots \ge \bar{X}_N(t)$. Suppose that initially the system has a particle at $\bar{X}_1(0) = x$ and that for some $\delta < 1$,
\[
\sum_{n=1}^{N} e^{\sqrt{2}(\bar{X}_n(0) - x)} \le N^{\delta}. \tag{8}
\]

Figure 2: Close-up on the cloud of particles with $N = 1000$. (a) $s(x, y) = \|(x, y)\|$, $t = 1000$. (b) $s(x, y) = x + y$, $t = 200$.

Theorem 1.7. Assume (8). Then there exists $c_\delta > 0$ (depending only on $\delta$) such that for $t = c_\delta (\log N)^3$, there exists $a > 0$ such that
\[
\liminf_{\eta \to 0} \liminf_{N \to \infty} \mathbb{P}\Big(\mathrm{diam}^{\parallel}_t \le a \log N, \ \mathrm{diam}^{\perp}_t \ge \eta (\log N)^{3/2}\Big) = 1. \tag{9}
\]

In fact it can be shown that under the same initial condition, the order of magnitude of $\mathrm{diam}^{\parallel}_t$ really is $\log N$, in the sense that we also have $\mathrm{diam}^{\parallel}_t \ge a' \log N$ with probability tending to 1 as $N \to \infty$, for some constant $a' < a$. The phenomenon has important consequences in population genetics which are discussed below.

We now make a series of comments on the meaning of the initial condition (8).

Remark 1.8. Intuitively, the condition (8) says that, after projecting onto $\mathrm{Span}(\lambda)$, only polynomially many particles lie within logarithmic distance of the maximal particle. More precisely, (8) holds as soon as there exist $c > 0$ and $\varepsilon < 1$ such that at most $N^{\varepsilon}$ particles lie in the interval $[\bar{X}_1(0) - c \log N, \bar{X}_1(0)]$.

Remark 1.9. An example of an initial condition which satisfies (8) with high probability is as follows: sample $X_1, \ldots, X_N$ in $\mathbb{R}^d$ independently according to a fixed distribution such that if $\bar{X} = \langle X, \lambda \rangle$, then for all $x > 0$,
\[
c_1 e^{-\alpha_1 x} \le \mathbb{P}(\bar{X} > x) \le c_2 e^{-\alpha_2 x} \tag{10}
\]
for some constants $c_1, c_2$ and $\alpha_1, \alpha_2$.
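The initial condition can be checked numerically on any given configuration. The sketch below computes the smallest exponent $\delta$ for which the sum in the initial condition is at most $N^{\delta}$, given the projected positions $\langle X_n(0), \lambda \rangle$. The helper names and the two test configurations are illustrative choices, not taken from the paper.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def condition_exponent(projected):
    """Return log(sum_n exp(sqrt(2) (x_n - x_max))) / log(N).

    The initial condition holds with exponent delta exactly when this
    value is at most delta, so values below 1 are the favourable ones.
    """
    n = len(projected)
    top = max(projected)
    total = sum(math.exp(SQRT2 * (x - top)) for x in projected)
    return math.log(total) / math.log(n)

def sample_exponential(n, rate, seed=0):
    """Hypothetical example of Remark 1.9: i.i.d. exponential projections."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n)]

# All particles at the same place: the sum equals N, so the exponent is 1.
clumped = [0.0] * 1000
# One leader far ahead of everybody else: the sum is about 1, exponent about 0.
leader = [0.0] + [-50.0] * 999

print(condition_exponent(clumped), condition_exponent(leader))
print(condition_exponent(sample_exponential(10_000, rate=2.0)))
```

The clumped configuration shows why some spreading of the front is needed: when every particle sits at the maximum, no $\delta < 1$ works.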

Remark 1.10. We believe, but have been unable to prove, that if the initial condition is as in the above remark then (8) will in fact be satisfied at arbitrarily large times. Indeed, comparing with known results, we expect that, at "equilibrium" (see Section 1.2 for a definition), $\bar{X}_1(0) = (1/\sqrt{2}) \log N$ and
\[
Y_N = \sum_{n} e^{\sqrt{2} \bar{X}_n(0)} \approx N L \int_0^L e^{\sqrt{2} x} \cdot e^{-\sqrt{2} x} \sin\Big(\frac{\pi x}{L}\Big)\, dx \sim c N L^2,
\]
where $L = (1/\sqrt{2})(\log N + 3 \log \log N)$. Hence the left-hand side of (8) should be of order $L^2$ and thus (8) should be satisfied at equilibrium. Thus condition (8) can be thought of as a condition specifying that the population is in a "metastable" state.



As we will see, the result in Theorem 1.7 is closely related to estimates about the genealogical timescale (or, more precisely, the time of the most recent common ancestor) in the population. In fact, Theorem 1.7 can be rephrased as follows:

Theorem 1.11. Let $N > 1$ and consider a Brunet-Derrida system with $N$ particles driven by the linear score function $s(x) = \bar{x} = \langle x, \lambda \rangle$. Assume that the initial condition satisfies (8). Then there exists $c_\delta > 0$ (depending only on $\delta$) such that any particle with fitness greater than $x$ at time 0 has descendants alive at time $c_\delta (\log N)^3$ with probability tending to 1 as $N \to \infty$.

By projecting the particle system onto $\mathrm{Span}(\lambda)$, we obtain a one-dimensional (standard) Brunet-Derrida system. Thus Theorem 1.11 applies verbatim to such systems, which partly confirms a prediction of [10], [11] (see item (iii) at the start of the introduction).
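The link between a genealogical time of order $(\log N)^3$ and a transverse diameter of order $(\log N)^{3/2}$ can be sketched by a purely heuristic scaling computation. This is an informal illustration and not the argument of the paper; it assumes that, in the linear case, ancestral lines projected onto the direction orthogonal to $\lambda$ behave like independent Brownian motions for as long as they are distinct.

```latex
% Heuristic only. Take two particles alive at time t = c_\delta (\log N)^3
% whose ancestral lines have been distinct throughout [0, t]. Ignoring the
% initial transverse offset, their transverse separation is a centred
% Gaussian with variance of order 2t:
\[
  \bigl|\langle X_n(t) - X_m(t), \lambda^{\perp} \rangle\bigr|
  \;\approx\; \sqrt{2t}\,|Z|
  \;=\; \sqrt{2 c_\delta}\,(\log N)^{3/2}\,|Z|,
  \qquad Z \sim \mathcal{N}(0, 1).
\]
% Since P(|Z| \ge \eta') \to 1 as \eta' \to 0, the separation exceeds
% \eta (\log N)^{3/2} with probability close to 1 once \eta is small,
% which is the shape of the double limit in the transverse bound.
```

The computation also indicates why the exponent $3/2$ is natural: it is half of the exponent 3 of the genealogical time scale, by Brownian scaling.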

The heart of the proof relies on delicate quantitative estimates concerning the displacement of the minimal position in one-dimensional (standard) Brunet-Derrida systems. This is a difficult quantity to study rigorously, as the evolution of the minimum depends on all the particles nearby, which make up all but a negligible fraction of the population. In particular, as a process it is non-Markovian and not continuous, though in the limit $N \to \infty$ it becomes deterministic and continuous. Our result is as follows.

Proposition 1.12. Consider a (standard) one-dimensional Brunet-Derrida system with $N$ particles, ordered by decreasing fitness $X_1(t) \ge \ldots \ge X_N(t)$. Assume that the initial condition satisfies (8). Let
\[
\mu = \sqrt{2 - \frac{2\pi^2}{(\log N)^2}}.
\]
Then there exists $c_\delta > 0$ (depending only on $\delta$) such that as $N \to \infty$,
\[
\mathbb{P}\big(X_N(t) - x \le \mu t, \ \forall\, t \le c_\delta (\log N)^3\big) \to 1. \tag{11}
\]

A corresponding lower bound for the progression of the minimal position can be established from an intermediate result of Bérard and Gouéré [2], with their proof adapted for branching Brownian motion.

Proposition 1.13. Consider a (standard) one-dimensional Brunet-Derrida system with $N$ particles, ordered by decreasing fitness $X_1(t) \ge \ldots \ge X_N(t)$. For all $\eta > 0$, there exists $c_\eta > 0$ such that for any initial condition as $N \to \infty$,

X N (t)-X N (0)^ (V2 - ^ ( | o ^ 2 ) V t < Cr? (log AO 3 ) -» 0- (12) 
1.2 Discussion and open problems 



Long term behaviour for general fitness functions. Theorems 1.1 and 1.4 establish the long-term behaviour for the cloud of particles for the two special cases where the function $s$ is either the Euclidean norm or a linear function. In both cases, the cloud escapes to $\infty$ at positive speed in a possibly random direction. It would be interesting to see how general a phenomenon this is. For instance, assume that $s : \mathbb{R}^d \to \mathbb{R}$ is a smooth, unbounded convex function. What can be said about the long-term behaviour then? One first observation is that the cloud of particles should essentially stay concentrated on level sets of the function $s$.

Genealogy. In both cases studied here (Euclidean case or case A, and linear case or case B), we observe that the population lines up on an essentially one-dimensional subspace of $\mathbb{R}^d$. For truly one-dimensional systems, it is predicted that the Bolthausen-Sznitman coalescent describes the genealogy of a sample from the population, after rescaling time by $(\log N)^3$. It is therefore reasonable to predict that the same property will hold in higher dimensions as well, at least in cases A and B and perhaps more generally, suggesting that the Bolthausen-Sznitman coalescent is a universal scaling limit in all dimensions, subject to assumptions on the function $s$.

Equilibrium shape in one dimension. Consider the empirical distribution of a (standard) one-dimensional Brunet-Derrida particle system,
\[
\mu^N_t = \frac{1}{N} \sum_{n=1}^{N} \delta_{X_n(t)},
\]
and the associated càdlàg empirical tail distribution
\[
F^N(t, x) = \frac{1}{N} \sum_{n=1}^{N} \mathbf{1}_{\{X_n(t) > x\}}.
\]
It is not hard to see that the system of particles, viewed from the minimum position at time $t$, has regeneration times and therefore $F^N(t, x + X_N(t))$ converges pointwise to some limit distribution $F^N_{eq}(x)$ as $t \to \infty$, wherever $F^N_{eq}$ is continuous. It is natural to start the particle system in some initial condition distributed according to $F^N_{eq}$ and ask for its properties. We believe, but have been unable to prove, that $F^N_{eq}$ satisfies (8). In fact, we make the following conjecture about $F^N_{eq}$.

Reasoning by analogy with the results of Durrett and Remenik [13], and using the martingale problem for the empirical distributions of a free branching Brownian motion (see for example, Lemma 1.10 in Etheridge [14]), we expect $F^N(t, x)$ to converge in distribution to $F(t, x)$, the solution to the free boundary problem:
\[
\begin{aligned}
\frac{\partial F}{\partial t} &= \frac{1}{2} \frac{\partial^2 F}{\partial x^2} + F(t, x) && \forall\, x > \gamma(t), \\
F(t, x) &= 1 && \forall\, x \le \gamma(t),
\end{aligned} \tag{13}
\]
where $\gamma : [0, \infty) \to \mathbb{R}$ is a continuous, increasing function starting from 0, which is part of the unknown in (13). (Note that Durrett and Remenik's argument breaks down for particles that perform Brownian motion, as it is essential in their coupling that particles sit still in between branching events. It is unclear how to adapt their argument to the case of Brownian motion.) The first equation is simply the linearised FKPP equation (2), which is satisfied asymptotically as $x \to \infty$ by the distribution tail of the position of the rightmost particle in a (free) branching Brownian motion. The second equation on the other hand represents the effect of selection, and $\gamma(t)$ then describes the limiting position of the minimal particle. [13] shows the existence of a family of travelling wave solutions for a class of problems similar to (13).

Here the travelling wave solutions can be found explicitly: if $F(t, x) = W(x - ct)$ solves (13) we find
\[
-cW' = \frac{1}{2} W'' + W.
\]
This is a second order differential equation which, as is well known, has positive solutions only if the speed $c$ of the travelling wave satisfies $c \ge \sqrt{2}$. For $c = \sqrt{2}$ the solution is
\[
W^*(x) = (\sqrt{2} x + 1) e^{-\sqrt{2} x}. \tag{14}
\]
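For completeness, the explicit travelling wave can be recovered by a short computation from the equation $-cW' = \tfrac{1}{2}W'' + W$. The sketch below assumes the normalisation $W(0) = 1$ (the boundary value of the tail distribution at the minimal particle) and $W'(0) = 0$ (a smooth fit at the free boundary); the second condition is an assumption made here for illustration.

```latex
% Exponential ansatz W(x) = e^{rx} in  -cW' = (1/2) W'' + W :
\[
  \tfrac{1}{2} r^2 + c\,r + 1 = 0
  \quad\Longrightarrow\quad
  r = -c \pm \sqrt{c^2 - 2}.
\]
% If c < \sqrt{2} the roots are complex, so real solutions oscillate and
% change sign: positivity forces c \ge \sqrt{2}.
% At c = \sqrt{2} there is a double root r = -\sqrt{2}, hence
\[
  W(x) = (a x + b)\, e^{-\sqrt{2} x},
  \qquad
  W'(x) = \bigl(a - \sqrt{2}\,(a x + b)\bigr) e^{-\sqrt{2} x}.
\]
% Assumed boundary conditions W(0) = 1 and W'(0) = 0 give
\[
  b = 1, \qquad a - \sqrt{2}\, b = 0
  \quad\Longrightarrow\quad
  W(x) = (\sqrt{2}\, x + 1)\, e^{-\sqrt{2} x}.
\]
```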



Turning back to $F^N_{eq}$, stationarity suggests that $F^N_{eq}$ is, in the limit as $N \to \infty$, a travelling wave solution of (13). But by Proposition 1.12, if $F^N_{eq}$ is a travelling wave solution the speed would have to be at most $\sqrt{2}$, and so equal to $\sqrt{2}$. Therefore, we conjecture that
\[
F^N_{eq}(x) \to W^*(x) \tag{15}
\]
uniformly on compact sets as $N \to \infty$.

Equilibrium shape in high dimensions. Let $d \ge 1$ and fix an arbitrary smooth selection function $s$. For reasons similar to above, it is possible to define a notion of limiting equilibrium shape of the system as $t \to \infty$. Theorem 1.7 gives information about the dimensions (width and length) of the limiting shape in case B. However, an inspection of the simulations suggests that particles are far from uniformly distributed within that shape. In the direction $\lambda$, we expect the density of particles to be close to $W^*(x)$ for the same reasons as above. In the transverse direction $\lambda^\perp$ however, particles appear somewhat 'clustered'. Indeed, this is to be expected given the hierarchical structure of the Bolthausen-Sznitman coalescent. Clusters of particles represent groups of particles coming from a close common ancestor. However, clusters are also intertwined because of heat kernel smoothing. It is an interesting question to identify the density of particles at equilibrium.



1.3 Biological applications: the effect of recombination 

As alluded to in earlier parts of this introduction, our Brunet-Derrida system in more than one dimension can be thought of as a model for the effect of selection on multiple linked loci. In this interpretation, we track the fitness of not one but $d$ loci in a population of size $N$. Each particle corresponds to one-half of an individual's genetic material, and each of the $d$ coordinates of that particle represents the fitness at the corresponding locus. Her total fitness will then be a function of these $d$ values, typically just the sum. In this interpretation, we are assuming that the fitnesses of the particles evolve like independent Brownian motions and that particles branch independently of one another, which is a simplification because in reality, two particles (making up one individual) will branch simultaneously. For the same reasons, whereas in our model we only remove one particle at a time, it would make more sense to remove two particles at once (also making up an individual). But we choose to ignore the correlations between an individual's two genetic halves, and still believe that the model captures some essential features of reproduction. Note that, as specified above, the model ignores the possibility of recombination. But we will explain precisely the effect of adding recombination to the model in a moment and show that it leads to an increase in overall fitness.

It has been a longstanding problem in evolutionary biology to explain the ubiquitous nature of diploid populations over haploid populations. Indeed, in diploid populations the chance of a particular gene being transmitted to an offspring is only 50%, whereas it is 100% in haploid populations! This would suggest that haploid populations are far more advantageous from the point of view of a particular gene. This paradox was in fact raised soon after the introduction of Darwin's theory of natural selection and evolution.

Figure 3: In the presence of recombination, the offspring of two individuals with positions $(x_1, y_1)$ and $(x_2, y_2)$ is either $(x_1, y_2)$ or $(x_2, y_1)$. If the fitness across loci is negatively correlated (i.e., if the shape of the cloud is elongated in the direction transverse to the fitness gradient), this leads to an overall increase in the variance of the fitness distribution, even though the mean is unchanged. In turn, this results in increased response to natural selection.

As early as 1889, Weismann [20] advocated that sex functions to provide variation for natural selection to act upon. However it is fair to say that no real consensus was achieved in the population genetics community, especially after influential arguments by Williams [21] raised doubts on Weismann's theory. The controversy reached the point where understanding the advantage of sexual reproduction became the "queen of problems in evolutionary biology" P]. We refer to Burt [12] for an excellent and highly readable survey of this question.

In his study of the problem, Burt [12] observed that his models led to a negative correlation between the fitness on the two chromosomes, which is equivalent to a cloud of particles being spread out in the direction orthogonal to the fitness gradient (see Fig. 1D of [12]). He then reasoned that a small amount of recombination would lead to a reduction in this correlation and greater variance in the overall fitness, ultimately leading to a fitter population, as can be seen on Figure 3. Thus Theorem 1.7 can be viewed as a rigorous justification of the Weismannian proposal in this setting.

Acknowledgements. We are grateful to Julien Berestycki for a number of fruitful con- 
versations at several stages of this project. In particular we learned of the potential relation 
between multidimensional Brunet-Derrida systems and the role of recombination from him, 
and we are grateful to have been shown a draft of [6] which raised that issue. 
2 Proof of Theorems 1.1 and 1.4

We are now ready to give a proof of Theorems 1.1 and 1.4. In this section $N > 1$ is fixed.

We begin with a formal construction of the Brunet-Derrida particle system. Let $(J_i)_{i \ge 0}$ be the jump times of a Poisson process with rate $N$ with $J_0 = 0$, and let $(K_i)_{i \ge 1}$ be an independent sequence of i.i.d. uniform random variables on $\{1, \ldots, N\}$. The process is started in some given initial condition. Then inductively, for each $i \ge 1$, assuming that the system is defined up to time $J_{i-1}$ with $s(X_1(J_{i-1})) \ge \ldots \ge s(X_N(J_{i-1}))$, we define
\[
X_n(t) = X_n(J_{i-1}) + Z_n(t - J_{i-1}), \qquad t \in [J_{i-1}, J_i), \tag{16}
\]
where $(Z_n(t), t \ge 0)$ are independent Brownian motions in $\mathbb{R}^d$, independent from $(K_i)$ and $(J_i)$. At time $J_i$, we duplicate particle $X_{K_i}(J_i^-)$ and remove the particle attaining $\min_{1 \le n \le N} s(X_n(J_i^-))$. Note that if the duplicated particle is the particle of minimal score, the net effect is that nothing happens. We now relabel the particles over this interval in the usual convention of descending fitness so
\[
s(X_1(t)) \ge \ldots \ge s(X_N(t)), \qquad t \in [J_{i-1}, J_i).
\]
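The construction above translates almost line by line into an event-driven simulation. The following is a minimal sketch using only the Python standard library; the function name `simulate`, its parameters and the fixed seed are illustrative choices and not part of the paper. Branching times are generated as a Poisson process of rate $N$, particles move by independent Gaussian increments between branching times, and at each branching time a uniformly chosen particle is duplicated before the least fit particle is removed.

```python
import math
import random

def simulate(initial, score, horizon, seed=0):
    """Run the particle system up to time `horizon` and return the final cloud.

    initial : list of d-dimensional positions (tuples), one per particle
    score   : function mapping a position to its fitness
    """
    rng = random.Random(seed)
    particles = list(initial)
    n = len(particles)
    now = 0.0
    while True:
        # Waiting time until the next branching event: exponential with rate N.
        gap = rng.expovariate(n)
        last = gap >= horizon - now
        step = (horizon - now) if last else gap
        # Displacement step: independent Gaussian increments of variance `step`.
        sd = math.sqrt(step)
        particles = [tuple(c + rng.gauss(0.0, sd) for c in p) for p in particles]
        if last:
            break
        now += gap
        # Branching step: duplicate a uniformly chosen particle,
        # then remove one particle of minimal score.
        k = rng.randrange(n)
        particles.append(particles[k])
        worst = min(range(n + 1), key=lambda i: score(particles[i]))
        del particles[worst]
    # Relabel by decreasing fitness.
    return sorted(particles, key=score, reverse=True)

start = [(0.0, 0.0)] * 20
final = simulate(start, lambda p: p[0], horizon=2.0, seed=1)
```

Because the duplicated index is uniform, it does not matter for the law of the process whether the list is kept sorted between events; the sketch therefore sorts only once at the end.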



2.1 Proof of Theorem 1.4

We start with a few elementary facts about (free) branching Brownian motion $X_1(t), \ldots, X_{N(t)}(t)$ in $\mathbb{R}$, where $N(t)$ is the number of particles at time $t$. In keeping with our convention for this article we order particles from right to left. We assume that initially there is one particle at the origin.

The following lemma is a trivial but useful result which relates the statistics of all the particles alive in a free branching Brownian motion to a single Brownian motion, and is sometimes known in the literature as the many-to-one lemma (see for example [15]).

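Lemma 2.1. Let $T$ be a random stopping time of the filtration $\mathcal{F}_t = \sigma(X_i(s), i \le N(t), s \le t)$, and assume that $T$ is almost surely finite. For $s \le T$ and each $i \le N(T)$, let $Y_i(s)$ be the position at time $s$ of the unique ancestor of $X_i(T)$. Then for any bounded measurable functional $g$ on the path space $C([0, \infty))$,
\[
\mathbb{E}\Big[\sum_{i \le N(T)} g\big((Y_i(s))_{s \le T}\big)\Big] = \mathbb{E}\big[e^{T} g\big((B_s)_{s \le T}\big)\big],
\]
where $(B_s)_{s \ge 0}$ is a standard Brownian motion.

With the many-to-one lemma, we can obtain a naive bound for the maximum displacement of a particle at time $t$ from its parent at time 0, as well as the running maximum.

Lemma 2.2. For any $K > 0$,
\[
\mathbb{P}\big(X_1(t) \ge \sqrt{2} t + K\big) \le e^{-\sqrt{2} K}.
\]
Moreover,
\[
\mathbb{P}\Big(\sup_{s \le t} X_1(s) \ge \sqrt{2} t + K\Big) \le 2 e^{-\sqrt{2} K}.
\]

Proof. By Lemma 2.1,
\[
\mathbb{P}\big(X_1(t) \ge \sqrt{2} t + K\big) \le \mathbb{E}\Big[\sum_{i \le N(t)} \mathbf{1}_{\{X_i(t) \ge \sqrt{2} t + K\}}\Big] = e^{t}\, \mathbb{P}\big(B_t \ge \sqrt{2} t + K\big) \le e^{-\sqrt{2} K},
\]
where we use the well known tail bound for a standard normal random variable $X$ and $a > 0$,
\[
\mathbb{P}(X \ge a) \le e^{-a^2/2},
\]
applied with $a = (\sqrt{2} t + K)/\sqrt{t}$, so that $a^2/2 = t + \sqrt{2} K + K^2/(2t)$.

For the historic maximum, a similar argument shows
\[
\mathbb{P}\Big(\sup_{s \le t} X_1(s) \ge \sqrt{2} t + K\Big) \le e^{t}\, \mathbb{P}\Big(\sup_{s \le t} B_s \ge \sqrt{2} t + K\Big).
\]
Using the reflection principle,
\[
\mathbb{P}\Big(\sup_{s \le t} B_s \ge \sqrt{2} t + K\Big) = 2\, \mathbb{P}\big(B_t \ge \sqrt{2} t + K\big),
\]
and the result follows. □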
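The first-moment bound in the proof can be checked numerically. The sketch below evaluates $e^t\, \mathbb{P}(B_t \ge \sqrt{2} t + K)$ exactly through the complementary error function and compares it with $e^{-\sqrt{2} K}$; the function names and the grid of values of $t$ and $K$ are illustrative.

```python
import math

def normal_tail(a):
    """P(Z >= a) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))

def expected_count_above(t, k):
    """e^t * P(B_t >= sqrt(2) t + K): the many-to-one upper bound."""
    level = math.sqrt(2.0) * t + k
    return math.exp(t) * normal_tail(level / math.sqrt(t))

checks = []
for t in (0.5, 1.0, 5.0, 20.0):
    for k in (0.5, 1.0, 3.0):
        lhs = expected_count_above(t, k)
        rhs = math.exp(-math.sqrt(2.0) * k)
        checks.append(lhs <= rhs)
        print(f"t={t:5.1f} K={k:3.1f}  first moment = {lhs:.3e}  bound = {rhs:.3e}")
```

The exact first moment is considerably smaller than the bound, which reflects both the Gaussian prefactor and the extra factor $e^{-K^2/(2t)}$ discarded in the proof.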

When a Brunet-Derrida system is driven by a score function $s(x) = \langle x, \lambda \rangle$, where $\lambda \in \mathbb{S}^{d-1}$, we have already noted that after projecting the particle system onto $\mathrm{Span}(\lambda)$, we recover a standard one-dimensional Brunet-Derrida system. For such systems, we have an easy but useful coupling used by Bérard and Gouéré [2].

Lemma 2.3. Consider two (standard) one-dimensional Brunet-Derrida systems, $(X_n(t), 1 \le n \le N)_{t \ge 0}$ and $(Y_n(t), 1 \le n \le N')_{t \ge 0}$, $N \le N'$, which are initially ordered $X(0) \prec Y(0)$ in the sense of stochastic domination: that is, there is a coupling of $X(0)$ and $Y(0)$ such that $X_n(0) \le Y_n(0)$ for all $1 \le n \le N$. Then we can couple $X(t)$ and $Y(t)$ for all time such that $X(t) \prec Y(t)$ for all time $t \ge 0$.

Proof. Construct $X(t)$ and $Y(t)$ using (16) with the same jump times $(J_i)_{i \ge 0}$ and the same family $(Z_n(t), t \ge 0)$ of independent Brownian motions in $\mathbb{R}$. □

Adapting the (easy) proof of Proposition 2 of [2] one obtains:

Lemma 2.4. Consider a one-dimensional Brunet-Derrida system initially with $X_1(0) \ge \ldots \ge X_N(0) = 0$. Then
\[
\frac{X_1(t)}{t} \to v_N
\]
almost surely, where $v_N > 0$ is a deterministic constant.

The argument is based on the monotonicity of Lemma 2.3 and Kingman's subadditive ergodic theorem. To see that $v_N > 0$ for $N > 1$, we observe that $v_1 = 0$ and that there is a straightforward strengthening of Proposition 3 of [2] to see that $(v_N, N \ge 1)$ is strictly increasing.

The same argument also applies to $X_N(t)$, but a priori the limiting velocity $v'_N$ might be distinct from $v_N$. In fact the following lemma, which can be proved in the same fashion as Proposition 1 of [2], shows that $v_N = v'_N$.

Lemma 2.5. Let $(X_n(s), 1 \le n \le N)_{s \ge 0}$ be a (standard) one-dimensional Brunet-Derrida system. Then for all $\varepsilon > 0$ and $t > (1 + \kappa) \log N$ for some $\kappa > 0$,
\[
\lim_{N \to \infty} \mathbb{P}\big(X_1(t) - X_N(t) \ge (3\sqrt{2} + \varepsilon) \log N\big) = 0.
\]

Corollary 2.6. For all $N > 1$ and $\varepsilon > 0$,
\[
\lim_{s \to \infty} \mathbb{P}\Big(\frac{X_1(s) - X_N(s)}{s} > \varepsilon\Big) = 0.
\]



Going back to a Brunet-Derrida system in $\mathbb{R}^d$, let $H = \{x \in \mathbb{R}^d : \langle x, \lambda \rangle = 0\}$ be the orthogonal hyperplane to $\lambda$ and let $p_H$ be the orthogonal projection onto $H$. For $x \in \mathbb{R}^d$ write $\|x\|_H = \|p_H(x)\|$.

Referring back to the construction of the system via (16), conditional on $\mathcal{F}_{J_{i-1}}$ (where $\mathcal{F}_t$ is the filtration generated by the whole system up to time $t$), particles perform $(d-1)$-dimensional Brownian motion on $H$ independent of the motion in $\mathrm{Span}(\lambda)$ up to time $J_i$, for every $i \ge 1$. Moreover, since $s(x) = \langle x, \lambda \rangle$, $p_H(X_m(J_i))$ is independent of the event that the particle $X_m$ survives a branching event at time $J_i$. Together, these two properties imply by induction that the path of a particle conditioned to survive until time $t$, when projected onto $H$, has the law of a standard $(d-1)$-dimensional Brownian motion. In other words, if $X_n(t)$ is a surviving particle at time $t$ and $Y_n(s)$ is the ancestor of $X_n(t)$ at time $s \le t$, then $(p_H(Y_n(s)), s \le t)$ is a standard $(d-1)$-dimensional Brownian motion. Therefore,
\[
\frac{\|X_n(t)\|_H}{t} = \frac{\|p_H(Y_n(t))\|}{t} \to 0
\]
almost surely for all $1 \le n \le N$. Therefore
\[
\max_{1 \le n, m \le N} \frac{\|X_n(t) - X_m(t)\|_H}{t} \le 2 \max_{1 \le n \le N} \frac{\|X_n(t)\|_H}{t} \to 0.
\]
Together with Lemma 2.4, this completes the proof of Theorem 1.4. □



2.2 Proof of Theorem 1.1

Assume now $s(x) = \|x\|$. Recall that in this setting, for $X_n(t) \ne 0$, we write $X_n(t) = R_n(t)\Theta_n(t)$ where $R_n(t) > 0$ and $\Theta_n(t) \in \mathbb{S}^{d-1}$ is continuous whenever $X_n(t)$ is continuous. Note also that for $d \ge 2$, $d$-dimensional Brownian motion almost surely never hits 0. Hence, except for any particles initially at 0, this decomposition is always well-defined. We can work around particles starting from 0 by instead taking the system at time $t > 0$ as its initial state without altering the proofs. Therefore, without loss of generality, we shall assume from here on that $R_N(0) > 0$ and we need not worry about any particles at 0.

When considering $(R_n(t), 1 \le n \le N)$, we can work in a one-dimensional setting and construct the system in a similar manner as before, except now the displacement step (16) becomes
\[
R_n(t) = R_n(J_{i-1}) + S_n\big(R_n(J_{i-1}), t - J_{i-1}\big), \qquad t \in [J_{i-1}, J_i), \tag{17}
\]
where $(S_n(r, t), 1 \le n \le N)$ are an independent family of solutions to the Bessel stochastic differential equation
\[
dS(r, t) = dB(t) + \frac{d - 1}{2 S(r, t)}\, dt, \tag{18}
\]
where $B(t)$ is a standard one-dimensional Brownian motion, and $S(r, t)$ is a solution starting from $S(r, 0) = r$. In this construction, we see immediately that $S(r, t)$ is stochastically decreasing in $r$ and that as $r \to \infty$, $S(r, t)$ converges to a standard one-dimensional Brownian motion.
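The Bessel equation for the radial part can be explored with a simple Euler-Maruyama discretisation. The sketch below is illustrative only: the names `bessel_drift` and `euler_bessel_path`, the step size and the reflection used to keep the discretised path positive are ad hoc choices made here, not part of the paper.

```python
import math
import random

def bessel_drift(r, dim):
    """Drift (d - 1) / (2 r) of the radial part of d-dimensional Brownian motion."""
    return (dim - 1) / (2.0 * r)

def euler_bessel_path(r0, dim, dt, steps, seed=0):
    """Euler-Maruyama path of dS = dB + (d - 1) / (2 S) dt started from r0 > 0."""
    rng = random.Random(seed)
    path = [r0]
    r = r0
    for _ in range(steps):
        r = r + bessel_drift(r, dim) * dt + rng.gauss(0.0, math.sqrt(dt))
        # The discretised path can overshoot below zero although the true
        # process does not; reflect and floor it (an ad hoc numerical fix).
        r = max(abs(r), 1e-12)
        path.append(r)
    return path

near_origin = bessel_drift(0.1, 3)   # strong outward push close to 0
far_away = bessel_drift(1e6, 3)      # drift is negligible far from 0
path = euler_bessel_path(r0=1.0, dim=3, dt=1e-3, steps=1000, seed=2)
```

The two drift values illustrate the remark in the text: the push away from the origin is strong for small radius and vanishes as the radius grows, so that far from the origin the radial motion is close to a one-dimensional Brownian motion.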

Recall that the default ordering is in descending fitness, and so $R_1(t) \ge \ldots \ge R_N(t)$.

Lemma 2.7. For all $N > 1$,
\[
\frac{R_1(t) - R_N(t)}{t} \to 0,
\]
as $t \to \infty$ almost surely. Moreover,
\[
\frac{R_1(t)}{t} \to v_N,
\]
almost surely, where $v_N > 0$ is a deterministic constant.

Proof. Given $(R_n(t), 1 \le n \le N)$ constructed in the usual manner and with (17), we define
a family of one-dimensional Brunet-Derrida systems $(Y_n^\varepsilon(t), 1 \le n \le N)$ constructed in the
same manner as $(R_n(t), 1 \le n \le N)$ with the same $(J_i)$, $(K_i)$, but with the displacement
step

$$Y_n^\varepsilon(t) = Y_n^\varepsilon(J_{i-1}) + W_n^\varepsilon(t - J_{i-1}), \qquad t \in [J_{i-1}, J_i), \tag{19}$$

where $(W_n^\varepsilon(t), 1 \le n \le N)$ are independent Brownian motions in $\mathbb{R}$ with drift $\varepsilon$. These
processes satisfy the stochastic differential equation

$$dW^\varepsilon(t) = dB(t) + \varepsilon\,dt. \tag{20}$$

Suppose we couple the family $(Y_n^0(t), 1 \le n \le N)$ to $(R_n(t), 1 \le n \le N)$ by using the same
underlying $B(t)$ to drive the solutions to (18) and (20) for each $i$ and $n$. Then we see that
under this coupling

$$\liminf_{t\to\infty} \frac{R_N(t)}{t} \ge \liminf_{t\to\infty} \frac{Y_N^0(t)}{t}.$$

But $(Y_n^0(t), 1 \le n \le N)$ is a standard one-dimensional Brunet-Derrida system and by Lemma
2.4,

$$\lim_{t\to\infty} \frac{Y_N^0(t)}{t} = v_N$$

almost surely for some deterministic constant $v_N > 0$. Therefore, almost surely, $R_N(t) \to \infty$.
So under this coupling, for every $\varepsilon > 0$,

$$\limsup_{t\to\infty} \frac{R_1(t)}{t} \le \limsup_{t\to\infty} \frac{Y_1^\varepsilon(t)}{t}$$

almost surely. However, we note that if the drift is a constant equal to $\varepsilon$, then
$Y_i^\varepsilon(t) = Y_i^0(t) + \varepsilon t$ for all $1 \le i \le N$ and all $t \ge 0$, hence by Lemma 2.4,

$$\lim_{t\to\infty} \frac{Y_1^\varepsilon(t)}{t} = v_N + \varepsilon$$

almost surely. Since $\varepsilon > 0$ was arbitrary, we have that almost surely

$$\lim_{t\to\infty} \frac{R_1(t)}{t} = \lim_{t\to\infty} \frac{R_N(t)}{t} = v_N.$$

□
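The one-dimensional Brunet-Derrida system used as a comparison process in this proof can be explored numerically. The sketch below is a rough time-discretised simulation, not the exact continuous-time construction: each particle branches with probability `dt` per step (an approximation of rate-1 branching), and selection down to the `n` largest positions is applied once per step. All names and default parameters are illustrative assumptions.

```python
import math
import random


def simulate_brunet_derrida(n: int, t_end: float, dt: float = 0.01,
                            seed: int = 0) -> list:
    """Approximate a one-dimensional N-particle system with selection.

    Returns the n positions at time t_end, sorted in descending order
    (the ordering by fitness used in the text).
    """
    rng = random.Random(seed)
    xs = [0.0] * n
    sd = math.sqrt(dt)
    for _ in range(int(round(t_end / dt))):
        # Brownian displacement of every particle over one time step.
        xs = [x + rng.gauss(0.0, sd) for x in xs]
        # Each particle branches with probability dt in this step.
        children = [x for x in xs if rng.random() < dt]
        xs.extend(children)
        # Selection: keep only the n largest positions.
        xs.sort(reverse=True)
        del xs[n:]
    return xs
```

Dividing the smallest returned position by `t_end` for increasing `t_end` gives a crude empirical estimate of the speed, subject to discretisation error.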
Having established the asymptotic behaviour of $R(t)$, we now turn our attention to
$\Theta(t)$. The main idea here is that the time to the most recent common ancestor of all $N$
particles can be naively dominated uniformly over all time. We shall formalise this statement
with the following lemma.

Lemma 2.8. Let $\tau(t)$ be the time to the most recent common ancestor for $X_1(t), \dots, X_N(t)$.
Then for all $t$ sufficiently large, $\tau(t) - 1$ is stochastically dominated by a geometric random
variable of parameter $p$, where $p > 0$.

Proof. Let $s \ge 0$ be an integer, and consider the system $X_1(s), \dots, X_N(s)$ at time $s$. We
assume that the Brunet-Derrida system $(X_n(t), 1 \le n \le N)_{t \ge s}$ is obtained from a free
branching Brownian motion $(X_n(t), 1 \le n \le N(t))_{t \ge 0}$ in the obvious manner. Let $A_s$ be the
event that, for this free process, when the particle located at $X_1(s)$ at time $s$ first branches
after time $s$, its score is $\ge R_1(s) + 1$, and it subsequently produces at least $N$ offspring by
time $s + 1$ whose score always stays above $R_1(s) + 1/2$. Let $B_s$ be the event that for the free
process, the particles initially located at $X_2(s), \dots, X_N(s)$ do not branch before time $s + 1$
and that

$$\sup_{2 \le n \le N}\ \sup_{t \in [s, s+1]} \|Y_n(t) - Y_n(s)\| \le 1/2, \tag{21}$$

where $Y_n(t)$ is the location at time $t$ of the descendant of the particle located at $X_n(s)$ at
time $s$. Note that $Y_n(t)$ is well-defined since $X_n(s)$ has a unique descendant for $2 \le n \le N$.

Note that $A_s$ and $B_s$ are independent events. Moreover, $B_s$ is independent of $X(s)$, so
there exists $p_2 > 0$ such that

$$\mathbb{P}(B_s \mid \mathcal{F}_s) = \mathbb{P}(B_s) \ge p_2$$

almost surely for all $s$, where $\mathcal{F}_s$ denotes the filtration generated by the entire process up to
time $s$. Likewise, $A_s$ given $R_1(s)$ is independent of $\mathcal{F}_s$. To lose the dependence on $R_1(s)$,
we use an analogous coupling as in the proof of Lemma 2.7 where we stochastically bound
$(R_n(t), 1 \le n \le N)$ from below by a standard one-dimensional Brunet-Derrida system
$(Y_n^0(t), 1 \le n \le N)$. We define the event $A'_s$ to be the event that, for a one-dimensional free
branching Brownian motion, a particle located at $R_1(s)$ first branches after time $s$, its score is
$\ge R_1(s) + 1$, and it subsequently produces at least $N$ offspring by time $s + 1$ whose score
always stays above $R_1(s) + 1/2$. We now have that $A'_s$ is independent of $R_1(s)$ and therefore
of $\mathcal{F}_s$, and

$$\mathbb{P}(A_s \mid \mathcal{F}_s) \ge \mathbb{P}(A'_s \mid \mathcal{F}_s) = \mathbb{P}(A'_s).$$

So there exists $p_1 > 0$ such that

$$\mathbb{P}(A_s \mid \mathcal{F}_s) \ge p_1$$

almost surely for all $s$. We call $p = p_1 p_2 > 0$, and deduce from the above that if $G_s = A_s \cap B_s$,

$$\mathbb{P}(G_s \mid \mathcal{F}_s) \ge p$$

almost surely for all $s$. Note that when $A_s \cap B_s$ occurs, all the particles at time $s + 1$ in
the Brunet-Derrida system necessarily descend from the maximum particle at time $s$. Hence
$\tau(s + 1) \le 1$.

Applying this argument iteratively, we deduce that

$$\mathbb{P}(\tau(t) > k) \le \mathbb{P}\big(G^c_{t-k} \cap G^c_{t-k+1} \cap \dots \cap G^c_{t-1}\big),$$

from which the result follows. □
Figure 4: Proof of (22). The angle is maximised when the triangle formed by $0$, $X_m(s)$ and
$X_n(t)$ is rectilinear.

With Lemma 2.8, we are now in a position to complete the proof of O) with the following 
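lemma. Endow $S^{d-1}$ with the usual spherical metric $D$: for $\Theta_1, \Theta_2 \in S^{d-1}$, let
$D(\Theta_1, \Theta_2)$ be the distance on the sphere, that is,

$$D(\Theta_1, \Theta_2) = \cos^{-1}\langle \Theta_1, \Theta_2 \rangle.$$

Lemma 2.9. For all $N > 1$,

$$\max_{1 \le m, n \le N} D(\Theta_m(t), \Theta_n(t)) \to 0$$

as $t \to \infty$ almost surely.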

Proof. Given two particles $X_m(s), X_n(t) \in \mathbb{R}^d$, let $r = \|X_m(s) - X_n(t)\|$ and assume for now
that $r \le R_n(t) = \|X_n(t)\|$. Then a simple geometric argument (see Figure 4) shows that the
distance $D(\Theta_m(s), \Theta_n(t))$ is biggest if $X_m(s)$ is perpendicular to $X_m(s) - X_n(t)$. Hence for
$r < R_n(t)$,

$$D(\Theta_m(s), \Theta_n(t)) \le \sin^{-1}\Big(\frac{r}{R_n(t)}\Big) \le \frac{\pi}{2}\,\frac{r}{R_n(t)}, \tag{22}$$

since $\sin^{-1}(x) \le \frac{\pi}{2}x$ for all $0 \le x \le 1$.
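The geometric bound can be checked on a concrete configuration. The following Python sketch (helper names are hypothetical) computes the angle between two points of the plane and compares it with $\sin^{-1}(r/R)$ and with $\frac{\pi}{2}\, r/R$, where $r$ is the distance between the points and $R$ the norm of the second one. The test point is placed in the extremal position, where the triangle has a right angle.

```python
import math


def angle_between(x, y) -> float:
    """Angle at the origin between two non-zero vectors x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] to guard against rounding before calling acos.
    return math.acos(max(-1.0, min(1.0, dot / (nx * ny))))


def angle_bound(r: float, big_r: float) -> float:
    """Linear upper bound (pi / 2) * r / R, valid when 0 <= r <= R."""
    return 0.5 * math.pi * r / big_r


# Extremal configuration: x is perpendicular to x - y, with r = 1 and R = 2.
y = (2.0, 0.0)
x = (1.5, math.sqrt(0.75))
r = math.dist(x, y)
theta = angle_between(x, y)
```

In this configuration the angle equals $\sin^{-1}(1/2) = \pi/6$, which is below the linear bound $\pi/4$.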

Given $0 < A < t$, let $\tau = \tau(t)$ be the time to the most recent common ancestor of all
the surviving particles at time $t$. We define $\tau = \infty$ should there be no such ancestor. We first
note that

$$\mathbb{P}(\tau \ge A) \le (1 - p)^A,$$

where $p$ is as in Lemma 2.8. Hence picking $A = C_1 \log t$ for some sufficiently large $C_1 > 0$,
and applying the first Borel-Cantelli lemma shows that there exists $T_1 > 0$, possibly random,
such that almost surely, $\tau \le A$ for all $t > T_1$.
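The choice of $C_1$ can be made explicit: with $A = C_1 \log t$ the bound $(1-p)^A$ equals $t^{-a}$ with $a = C_1 |\log(1-p)|$, and the series over integer $t$ is summable once $a > 1$. The short Python sketch below computes this exponent and the resulting threshold for $C_1$; the function names are illustrative.

```python
import math


def tail_exponent(p: float, c1: float) -> float:
    """Exponent a such that (1 - p) ** (c1 * log t) == t ** (-a)."""
    return -c1 * math.log(1.0 - p)


def smallest_summable_c1(p: float) -> float:
    """Threshold for C_1: any larger value gives a > 1, hence a summable series."""
    return 1.0 / (-math.log(1.0 - p))
```

For instance, with $p = 1/2$ any $C_1 > 1/\log 2 \approx 1.44$ suffices for the Borel-Cantelli step.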

On the event $\{\tau \le A\}$, let $X_k(t - \tau)$ be the position of the most recent common ancestor
of all the surviving particles at time $t$. Since $\sup D \le \pi$, using (22), we have:

$$\begin{aligned}
D(\Theta_m(t), \Theta_n(t)) &\le D(\Theta_n(t), \Theta_k(t - \tau)) + D(\Theta_m(t), \Theta_k(t - \tau)) \\
&\le \frac{\pi\rho}{R_k(t-\tau)}\,1_{\{\rho \le R_k(t-\tau)\}} + 2\pi\,1_{\{\rho > R_k(t-\tau)\}} \\
&\le \frac{2\pi\rho}{R_k(t-\tau)},
\end{aligned} \tag{23}$$

where $\rho = \sup_{u \le \tau} \max_{1 \le n \le N} \|X_n(t - \tau + u) - X_k(t - \tau)\|$, which on the event $\{\tau \le A\}$ can
be dominated by

$$\rho \le \sup_{s \le A}\ \sup_n \|Z_n(s)\|,$$

where $(Z_n(u), 1 \le n \le N(u))$ is a $d$-dimensional branching Brownian motion started from
one particle at $0$. Writing for all $n$ and $u$,

$$Z_n(u) = (Z_n^{(1)}(u), \dots, Z_n^{(d)}(u)) \in \mathbb{R}^d,$$

we have that $Z^{(1)}(s), \dots, Z^{(d)}(s)$ are one-dimensional branching Brownian motions and so

$$\mathbb{P}\big(\rho > \sqrt{2}\,dA + C_2 d \log t,\ \tau \le A\big) \le 2d\,\mathbb{P}\Big(\sup_{s \le A}\sup_n Z_n^{(1)}(s) > \sqrt{2}A + C_2 \log t\Big) \le 2d\,e^{-\sqrt{2} C_2 \log t}, \tag{24}$$

where (24) follows by Lemma 2.2. This is also summable for sufficiently large $C_2$, hence we
deduce that almost surely there exists $T_2 > 0$, possibly random, such that if $t > T_2$ then
$\tau \le C_1 \log t$ and $\rho \le C_3 \log t$ where $C_3 = \sqrt{2}\,dC_1 + dC_2$.

Then for $t > T_2$, applying (23) and Lemma 2.7 there exists some $C_4 > 0$ such that

$$\frac{\rho}{R_k(t - \tau)} \le \frac{C_4 \log t}{v_N (t - C_1 \log t)}.$$

The right hand side tends to $0$ as $t \to \infty$ uniformly over $m, n$, so almost surely

$$\sup_{m,n} D(\Theta_m(t), \Theta_n(t)) \to 0,$$

as desired. □



Note that by Lemma 2.8, the system eventually has a unique most recent common ancestor.
We also observed that almost surely, the time of the most recent common ancestor
$t - \tau(t) \to \infty$ as $t \to \infty$. If we consider the genealogical path of the most recent common
ancestor, we see that there is a unique immortal genealogical path in the system, or the 'spine',
from which all the particles alive at sufficiently large times descend.

Let $X^*(t)$ be the particle of the spine at time $t$ and take the usual decomposition:
$X^*(t) = R^*(t)\Theta^*(t)$ where $R^*(t) > 0$ and $\Theta^*(t) \in S^{d-1}$ is continuous. We now complete the
proof of Theorem 1.1 by showing the angular part of the spine converges.

Proposition 2.10. For all $N > 1$, $\Theta^*(t)$ converges almost surely as $t \to \infty$.

We offer two proofs of this proposition. One is shorter but relies explicitly on stochastic 
calculus, and hence works only for the exact situation described in this paper. On the other 
hand the second proof is a bit longer but more robust; in particular it carries over to slightly 
more general Brunet-Derrida particle systems than the ones we consider in this paper: see 
Remark 2.13.

First proof of Proposition 2.10. We start by recalling the classical skew-product decomposition
of Brownian motion (see for example Section 7.15 of [E]). The version we present here is 
Theorem 1.1(d) of [EJ. 
Let $(X(t), t \ge 0)$ be a $d$-dimensional Brownian motion, and write $X(t) = R(t)\Theta(t)$ with
$R(t) > 0$ and $\Theta(t) \in S^{d-1}$ and $R, \Theta$ continuous. Let

$$H_t = \int_0^t R(s)^{-2}\,ds. \tag{25}$$

Then,

(i) $(R(t), t \ge 0)$ is a Bessel process of order $d$.

(ii) Under the time change $\Phi(H_t) = \Theta(t)$, $(\Phi(t), t \ge 0)$ is a Brownian motion on $S^{d-1}$.

(iii) $(\Phi(t), t \ge 0)$ is independent of $(R(t), t \ge 0)$.

In the special case of $d = 2$, we can write $\Theta(t)$ as $e^{iB(H_t)}$ where $(B(t), t \ge 0)$ is a standard
Brownian motion in $\mathbb{R}$ independent of $(R(t), t \ge 0)$. We note here that in this case $\Phi(t) = e^{iB(t)}$.

Now consider the system $(X(t), t \ge 0)$ that results from not enforcing selection: the underlying
free $d$-dimensional branching Brownian motion started from $N$ particles at $X_1(0), \dots, X_N(0)$
coupled to the Brunet-Derrida system. It is clear that this can be constructed by considering
the skew product decomposition of every Brownian path in the system,
$X_i(t) = R_i(t)\Theta_i(t) = R_i(t)\Phi_i(H_i(t))$. This is a bit cumbersome, but here are the details.

Let $\mathcal{T}$ be the underlying branching tree (which by assumption is just an ordinary Yule
process). We use Neveu's formalism for binary trees, i.e., $\mathcal{T}$ is a set of vertices given by
$\mathcal{T} = \bigcup_{n=0}^{\infty} \{0, 1\}^n$ and each vertex $v$ has attached to it an independent exponential random
variable of mean 1, $X_v$, representing the lifetime of this individual. We call $[s_v, t_v]$ the interval
of time over which this particle is alive, thus $t_v - s_v = X_v$ and so $s_v = \sum_{w \prec v} X_w$ (where $w \prec v$
means $w$ is an ancestor of $v$). We also attach to each $v$ a Bessel process $R_v(t)$ defined over the
interval of time $[s_v, t_v]$ in the natural way, by solving the SDE

$$dR_v(t) = dB_v(t) + \frac{d-1}{2R_v(t)}\,dt, \qquad t \in [s_v, t_v],$$

where the Brownian motions $B_v$ are independent for different vertices $v$, and by requiring
continuity of the resulting Bessel process when we move up along the branches of the tree.
We extend the definition of $R_v(t)$ to the entire interval $[0, t_v]$ simply by defining $R_v(s) = R_w(s)$
where $w$ is the unique ancestor of $v$ alive at time $s$ (i.e., such that $s \in [s_w, t_w]$).

We further enrich this structure by associating to each vertex $v$ an angle process $\Theta_v(t)$,
also defined over the interval of time $[s_v, t_v]$, which is defined by applying the construction
(25) in between two successive branching events. More precisely, let

$$H_v(t) = \int_0^t R_v(s)^{-2}\,ds,$$

let $s'_v = H_v(s_v)$ and $t'_v = H_v(t_v)$. Consider a family of Brownian motions
$(\Phi_v(t), t \in [s'_v, t'_v], v \in \mathcal{T})$ on $S^{d-1}$ such that the evolutions of $\Phi_v$ over $[s'_v, t'_v]$ are independent for different
vertices $v \in \mathcal{T}$. As above we extend $\Phi_v(t)$ to the interval $[0, t'_v]$ by defining $\Phi_v(t) = \Phi_w(t)$
where $w$ is the unique ancestor of $v$ such that $t \in [s'_w, t'_w]$, and we have chosen so that $\Phi_v(t)$
is a continuous function of $t$ over $[0, t'_v]$ for all $v \in \mathcal{T}$. We now define $\Theta_v$ by the formula

$$\Theta_v(t) = \Phi_v(H_v(t)),$$

for $s_v \le t \le t_v$.

Let $S(t)$ be the set of particles alive at time $t$, i.e., the set of vertices $v \in \mathcal{T}$ such that
$t \in [s_v, t_v]$. Let $N(t) = |S(t)|$ and order the vertices in $S(t)$ by $v_1, \dots, v_{N(t)}$ in such a way
that $R_1(t) \ge R_2(t) \ge \dots$, where $R_i(t) = R_{v_i}(t)$. Also let $\Theta_i(t) = \Theta_{v_i}(t)$ for $1 \le i \le N(t)$.

Our system of branching Brownian motion then consists of

$$X_i(t) = R_i(t)\Theta_i(t), \qquad 1 \le i \le N(t),\ t \ge 0.$$

Having described the skew product decomposition of a free branching Brownian motion
$(X_i(t), t \ge 0, 1 \le i \le N(t))$, we proceed with the proof of Proposition 2.10. By a ray we mean
a sequence $V = \{v_1, v_2, \dots\}$ such that $v_n$ is in generation $n$ of the tree and $v_n \prec v_{n+1}$ for all
$n \ge 0$. For each given ray $V$, we can follow the trajectory $X_V(t)$ of the Brownian motion
associated with $V$, that is, $X_V(t) = X_v(t)$ for the a.s. unique $v \in V$ such that $t \in [s_v, t_v)$. We
can also consider $R_V(t) = \|X_V(t)\|$ its radial part and $\Theta_V(t) = \Theta_v(t)$ its angular part. Observe
then that we have, by construction, $\Theta_V(t) = \Phi_V(H_V(t))$ where $\Phi_V$ is a Brownian motion on
$S^{d-1}$ and $H_V(t) = \int_0^t R_V(s)^{-2}\,ds$.

Now, consider the set $\mathcal{V}$ of rays $V$ such that

$$\liminf_{t\to\infty} \frac{R_V(t)}{t} > 0.$$

If $V \in \mathcal{V}$, $H_V(t) = \int_0^t R_V(s)^{-2}\,ds$ converges almost surely as $t \to \infty$ to a limit $H_V(\infty)$. Hence
$\Theta_V(t)$ converges as $t \to \infty$ to $\Phi_V(H_V(\infty))$. By Lemma 2.7, $X^*(t)$ almost surely is such a
path in $\mathcal{V}$ and so $\Theta^*(t)$ converges as $t \to \infty$ to a limit. □



Second proof of Proposition 2.10. Our second proof relies on a suitable martingale argument
rather than stochastic calculus, and hence is more robust. See Remark 2.13 for a discussion of
the setups to which it carries. Consider a free branching Brownian motion
$X = (X_i(t), 1 \le i \le N(t), t \ge 0)$, and write $X_i(t) = R_i(t)\Theta_i(t)$ for $t \ge 0$ and $1 \le i \le N(t)$. Let
$\mathcal{F}^R_t = \sigma(R_i(u), 1 \le i \le N(u), u \le t)$ and let $\mathcal{F}^\Theta_t = \sigma(\Theta_i(u), 1 \le i \le N(u), u \le t)$. Let
$\mathcal{G}_t = \sigma(\mathcal{F}^R_\infty \cup \mathcal{F}^\Theta_t)$, and note that $(\Theta^*(t), t \ge 0)$ is adapted to the filtration $(\mathcal{G}_s, s \ge 0)$.

We start by explaining the argument in the case $d = 2$, which is a bit simpler to describe.
Recall in the case $d = 2$, we can write $X^*(t) = R^*(t)e^{i\theta^*(t)}$, where $R^*(t) > 0$ and $\theta^*(t)$ is a
continuous function. This way of writing $X^*(t)$ is unique modulo a global constant multiple
of $2\pi$ in $\theta^*(t)$, which we fix once and for all at time 0.

Lemma 2.11. $(\theta^*(t), t \ge 0)$ is a martingale with respect to $(\mathcal{G}_t, t \ge 0)$.

Proof. It is a simple exercise left to the reader to check that $\theta^*(s)$ is integrable. Let $s > 0$ and
suppose at time $s$ there is a particle at position $z$ in this process, say $X_i(s) = z$. Consider
the transformation $T = T_z$ which is a reflection in the line $\mathbb{R}z$:

$$T_z(x) = 2\langle x, z' \rangle z' - x, \qquad x \in \mathbb{R}^d,$$

where $z' = z/\|z\|$. Note that $T_z$ is an orthogonal transformation and hence Wiener measure is
invariant under $T_z$. We apply $T_z$ to every descendant of the particle $X_i(s)$, and call
$T(X) = (T(X_i(t)), 1 \le i \le N(t), t \ge s)$ the resulting transformation of all the particles in the
branching Brownian motion. We note that since each $T_z$ leaves Brownian motion invariant, $T(X)$
also has the law of a free branching Brownian motion. Moreover, $T_z$ is an isometry so we
have $\|T(X_i(t))\| = R_i(t)$ for all $t \ge 0$ and all $1 \le i \le N(t)$. In particular, a particle $T(X_i(t))$
survives the selection procedure if and only if its mirror image $X_i(t)$ does. In particular, the
branching times and tree structure of the system are invariant under $T$.

These two properties imply that, conditional on $\mathcal{G}_s$, $T(X)$ has the same distribution as
$X$. On the other hand, observe that if $X(t)$ has a particle at $x$ descending from a particle at
$z$ at time $s$, then

$$\arg T(x) = \arg T_z(x) = 2\arg z - \arg x.$$

Applying this to $z = X^*(s)$ and $x = X^*(t)$ shows that

$$\mathbb{E}[\theta^*(t) - \theta^*(s) \mid \mathcal{G}_s] = \mathbb{E}[\theta^*(s) - \theta^*(t) \mid \mathcal{G}_s] = 0,$$

as desired. □

When $d \ge 3$, it is necessary to first project onto a two-dimensional subspace $\Pi$ before
applying a similar reasoning. Let $\Pi$ be a given such plane and let $p_\Pi$ be the orthogonal
projection onto $\Pi$. For some fixed $e \in \Pi$ and $x \in \mathbb{R}^d$, define $\arg_\Pi(x)$ to be the continuous
directed angle between $p_\Pi(x)$ and $e$.

Lemma 2.12. $(\arg_\Pi(X^*(t)), t \ge 0)$ is a martingale with respect to $(\mathcal{G}_t, t \ge 0)$.

For $z \in \mathbb{R}^d$, let $T_z(x)$ be defined by

$$T_z(x) = x - 2\big(p_\Pi(x) - \langle p_\Pi(x), z' \rangle z'\big), \qquad x \in \mathbb{R}^d,$$

where $z' = p_\Pi(z)/\|p_\Pi(z)\|$. More descriptively, if $x = u + v$ where $u \in \Pi$ and $v$ is orthogonal
to $\Pi$, then $T_z(x) = T_z(u) + T_z(v)$, where $T_z(v) = v$ and $T_z(u)$ is the reflection of $u$ in the line
$\mathbb{R}p_\Pi(z)$ within the plane $\Pi$. As before, applying this transformation to each descendant of a
particle located at $z$ at time $s$ yields a transformation $T$ of the branching Brownian motion,
which leaves the modulus of particles $\|T(X_i(t))\| = \|X_i(t)\|$ unchanged, and leaves the law
of branching Brownian motion also unchanged. But the choice of $T$ gives
$\arg_\Pi(T(x)) = 2\arg_\Pi(z) - \arg_\Pi(x)$ if $x$ descends from $z$. Thus

$$\mathbb{E}[\arg_\Pi(X^*(t)) - \arg_\Pi(X^*(s)) \mid \mathcal{G}_s] = \mathbb{E}[\arg_\Pi(X^*(s)) - \arg_\Pi(X^*(t)) \mid \mathcal{G}_s] = 0,$$

as above. This concludes the proof of the lemma. □



We are now ready to conclude the second proof of Proposition 2.10. It suffices to prove
that $\theta^*(t)$ converges as $t \to \infty$. We can assume without loss of generality that $d = 2$, as it
suffices to show that $\arg_\Pi(X^*(t))$ converges as $t \to \infty$ for any fixed arbitrary two-dimensional
subspace $\Pi$. Thus we will assume $d = 2$.

Let $t > 0$ and set $s = \lceil t \rceil - 1$. Define $\rho^*(t) = \sup_{u \in [s, t]} \|X^*(u) - X^*(s)\|$ and define a
stopping time $T$ to be the first time $t$ such that $\rho^*(t) \ge R^*(s)$. Let $\theta^*_T(t) = \theta^*(t \wedge T)$ be the
martingale $\theta^*(t)$ stopped at $T$. The reason for stopping at $T$ is to ensure a bound similar to
(22) holds for $\theta^*_T(t)$. The precise bound is

$$|\theta^*_T(s + 1) - \theta^*_T(s)| \le \frac{\pi}{2}\Big(\frac{\rho^*(s+1)}{R^*(s)} \wedge 1\Big). \tag{26}$$

Since $\theta^*_T$ is a martingale,

$$\mathbb{E}[\theta^*_T(t)^2] = \sum_{s < t} \mathbb{E}\big[(\theta^*_T(s + 1) - \theta^*_T(s))^2\big] \le \frac{\pi^2}{4}\sum_{s < t} \mathbb{E}\Big[\frac{\rho^*(s+1)^2}{R^*(s)^2} \wedge 1\Big]. \tag{27}$$

We observe that $R^*(s) \ge \|X_N(s)\|$ and proceed to bound $\|X_N(s)\|$ from below stochastically.

Using the coupling used in the proof of Lemma 2.7 we have that $\|X_N(t)\|$ dominates the
minimum at time $t$ of a standard one-dimensional Brunet-Derrida system started from $N$
particles all at the origin. In turn, this dominates $S(t) = Z(1) + \dots + Z(t)$ where the $Z(i)$ are
independent and identically distributed as the position of the minimum at time 1 of a
one-dimensional Brunet-Derrida system started from $N$ particles all at the origin, by the monotone
coupling of Lemma 2.3.



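Let $m = \mathbb{E}[Z(1)]$, and note that for $N > 1$, $m > 0$ for the same reason $v_N > 0$ in Lemma
2.4. Furthermore, $Z(1)$ is the minimum of a finite number of Brownian motions at time 1
and hence $\mathbb{E}[e^{-\lambda Z(1)}] < \infty$ for all $\lambda \ge 0$ and so $\psi(\lambda) = \log \mathbb{E}[e^{-\lambda Z(1)}]$ is well-defined. Then for
any $\lambda \ge 0$,

$$\mathbb{P}\big(S(t) \le \tfrac12 mt\big) \le \mathbb{P}\big(e^{-\lambda S(t)} \ge e^{-\frac12\lambda mt}\big) \le e^{\frac12\lambda mt}\,\mathbb{E}[e^{-\lambda S(t)}] = \exp(t f(\lambda)),$$

where $f(\lambda) = \tfrac12\lambda m + \psi(\lambda)$. We note that $f(0) = 0$ and $f'(0) = \psi'(0) + \tfrac12 m = -\tfrac12 m$ and that
for $\lambda$ sufficiently small, $f(\lambda) \le \tfrac12\lambda f'(0)$. Therefore,

$$\mathbb{P}\big(S(t) \le \tfrac12 mt\big) \le \exp\big(-\tfrac14\lambda mt\big).$$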
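The Chernoff-type step above can be illustrated numerically. The sketch below takes $f(\lambda) = \tfrac12 \lambda m + \psi(\lambda)$ with $\psi(\lambda) = \log \mathbb{E}[e^{-\lambda Z(1)}]$, and, purely as an illustrative assumption, replaces the true law of $Z(1)$ by a normal distribution with mean $m$, for which $\psi$ is explicit. It is not the distribution arising in the proof, and the function names are hypothetical.

```python
import math


def psi_gaussian(lam: float, m: float, var: float = 1.0) -> float:
    """log E[exp(-lam * Z)] for Z ~ Normal(m, var) (illustrative stand-in)."""
    return -lam * m + 0.5 * var * lam * lam


def f(lam: float, m: float, var: float = 1.0) -> float:
    """Exponent f(lam) = lam * m / 2 + psi(lam) in the Chernoff bound."""
    return 0.5 * lam * m + psi_gaussian(lam, m, var)


def chernoff_bound(t: float, lam: float, m: float, var: float = 1.0) -> float:
    """Upper bound exp(t * f(lam)) on P(S(t) <= m t / 2)."""
    return math.exp(t * f(lam, m, var))
```

In this Gaussian stand-in, $f(\lambda) \le -\lambda m/4$ holds exactly when $\lambda \le m/(2\,\mathrm{var})$, which makes the phrase "for $\lambda$ sufficiently small" quantitative.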



Therefore, by Jensen's inequality and since $x \mapsto x \wedge 1$ is concave,

$$\begin{aligned}
\mathbb{E}\Big[\frac{\rho^*(s+1)^2}{\|X_N(s)\|^2} \wedge 1\Big]
&\le \mathbb{E}\Big[\Big(\frac{4\rho^*(s+1)^2}{m^2 s^2} \wedge 1\Big) 1_{\{\|X_N(s)\| > ms/2\}}\Big] + \mathbb{P}\big(\|X_N(s)\| \le ms/2\big) \\
&\le \frac{4\,\mathbb{E}[\rho^*(s+1)^2]}{m^2 s^2} + e^{-\frac14\lambda m s}.
\end{aligned}$$

It is not hard to see that there exists $C_1 > 0$ depending on $N$ but not $s$ such that
$\mathbb{E}[\rho^*(s+1)^2] \le C_1$. Therefore, plugging into (27) we see that

$$\sup_{t \ge 0} \mathbb{E}[\theta^*_T(t)^2] \le C_2$$

for some $C_2 > 0$ and so $(\theta^*_T(t), t \ge 0)$ is a martingale bounded in $L^2$, and so converges almost
surely.

Obviously, this implies convergence of $\theta^*$ almost surely on the event $\{T = \infty\}$. Therefore
it suffices to check that almost surely $\rho^*(t) \ge R^*(s)$ eventually never happens. But note
that since $\mathbb{E}[\rho^*(t)^2] \le C_1 < \infty$ it follows from Markov's inequality and the Borel-Cantelli
lemma that $\rho^*(t) < v_N(\lceil t \rceil - 1)/2$ for all $t$ sufficiently large, and hence $\rho^*(t) \le R^*(s)$ for all $t$
sufficiently large by Lemma 2.7. Thus $\theta^*(t)$ converges almost surely as $t \to \infty$. □



Remark 2.13. In this paper we have concerned ourselves for simplicity with branching
Brownian motion with selection. However, there are a variety of possible alternatives: for instance,
initially, Brunet and Derrida considered a system where branching occurs at discrete time
steps $t = 0, 1, \dots$, and at each $t$, each particle branches into two (or possibly even more)
individuals, and the displacement follows a random walk with a given distribution. Yet another
alternative, taken up by Durrett and Remenik, is to have particles branch at rate 1 in
continuous time.

As is plain from the above proof, Theorem 1.1 remains true in each of these cases, under
the assumption that the displacement of particles is rotationally symmetric and under a second
moment assumption on the random walk jumps.
3 Proof of Proposition 1.12


Consider a standard one-dimensional Brunet-Derrida particle system with $N$ particles, started
from an initial configuration satisfying (8). For ease of notation, we will assume without loss
of generality (since the system is translation invariant) that $x = 0$.

The key idea of the proof is to compare the Brunet-Derrida system to the free branching
Brownian motion where killing occurs at a linear boundary. This is the idea which lies behind
papers such as [4], which used a wall of velocity $\sqrt{2 - 2\pi^2/(\log N)^2}$ in first approximation.
We shall use the same speed for our wall, which we will call the right wall. More precisely,
we define $L = (\log N)/\sqrt{2}$, for some v > 1, $\sqrt{\varepsilon} = \pi/L$ and $\mu^2 = 2 - \varepsilon$ and consider a moving
linear boundary $(L + \mu t, t \ge 0)$.
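For orientation, the parameters of the right wall can be evaluated numerically. The following Python sketch computes $L = (\log N)/\sqrt{2}$, $\varepsilon = (\pi/L)^2$ and $\mu = \sqrt{2 - \varepsilon}$ as defined above, and checks the identity $e^{\varepsilon T} = N^{2\pi^2 c}$ for $T = c(\log N)^3$ that is used later in this section. The function name and the sample values of $N$ and $c$ are illustrative assumptions.

```python
import math


def wall_parameters(n: int):
    """Return (L, eps, mu) with L = log(n)/sqrt(2), sqrt(eps) = pi/L, mu^2 = 2 - eps."""
    log_n = math.log(n)
    big_l = log_n / math.sqrt(2.0)
    eps = (math.pi / big_l) ** 2
    if eps >= 2.0:
        # mu would not be real; this happens only when log(n) <= pi.
        raise ValueError("n is too small: need log(n) > pi")
    return big_l, eps, math.sqrt(2.0 - eps)


n = 10 ** 6
big_l, eps, mu = wall_parameters(n)
c = 0.01                         # hypothetical value of the small constant
t_max = c * math.log(n) ** 3     # T = c (log N)^3
```

With these definitions $\varepsilon (\log N)^2 = 2\pi^2$, so the wall speed $\mu$ differs from $\sqrt{2}$ by a correction of order $(\log N)^{-2}$.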

There is a natural coupling to a free branching Brownian motion in $\mathbb{R}$ with particles
$\bar{X}_i(t)$, $1 \le i \le N(t)$, ordered in the usual way right to left: the Brunet-Derrida system is
obtained by ignoring any particle with index greater than $N$ and their descendants. Note the key
property of this coupling that for each $1 \le i \le N$, $X_i(t) \le \bar{X}_i(t)$, with probability one.
Therefore, under this coupling,

$$\mathbb{P}(X_N(t) \ge \mu t) \le \mathbb{P}(\bar{X}_N(t) \ge \mu t). \tag{28}$$

We note here that our initial condition (8) allows us to prove the Proposition up to a
constant (that does not depend on $N$) shift of the system. In other words, for any fixed
$C > 0$, it suffices to show

$$\mathbb{P}\Big(\sup_{t \le T}\{X_N(t) - \mu t\} \ge C\Big) \to 0,$$

since 

N N 
i=l i=l 

which also satisfies (8) for $N$ sufficiently large.

Let $T = c_\delta(\log N)^3$ where $c_\delta > 0$ is a small constant depending only on $\delta$ which we will
fix later on. The idea is that $c_\delta$ will be small enough so that the event $V_t$ that no particle
ever hits $L + \mu t$ up to time $t$ has high probability for any $t \le T$. (See Lemma 3.5.)

In view of this, let $I(t) \subset \{1, \dots, N(t)\}$ denote the index set of particles that never touch
position $L + \mu s$ for any $s \le t$. This corresponds to killing particles once they hit this position.
Note that $V_t = \{\#I(t) = N(t)\}$. Let

$$W_t = \sum_{i \in I(t)} 1_{\{\bar{X}_i(t) \ge \mu t\}}$$

be the number of particles of the free branching Brownian motion with killing at $L + \mu t$,
$t \le T$, which are greater or equal to $\mu t$. Then by (28), we get

$$\mathbb{P}(X_N(t) \ge \mu t) \le \mathbb{P}(W_t \ge N) + \mathbb{P}(V_t^c) \le \frac{\mathbb{E}[W_t]}{N} + \mathbb{P}(V_t^c) \tag{29}$$

by Markov's inequality.

In order to estimate $W_t$ we consider an additional wall (which we call the left wall) which
also moves at velocity $\mu$, and starts at position 0. We will treat separately the particles in
$I(t)$ that hit the left wall and those that do not. More precisely, let $J(t) \subset I(t)$ denote the
index of particles that never touch position $L + \mu s$ or $\mu s$ for any $s \le t$. Thus the particles in
$J(t)$ are killed when they hit either of two walls (the left and the right one) which both move
at velocity $\mu$, and start at position $0$ and $L$.
Let

$$W_1(t) = \sum_{i \in J(t)} 1_{\{\bar{X}_i(t) \ge \mu t\}}, \qquad W_2(t) = \sum_{i \in K(t)} 1_{\{\bar{X}_i(t) \ge \mu t\}},$$

where $K(t) \overset{\mathrm{def}}{=} I(t) \setminus J(t)$, so that

$$W_t = W_1(t) + W_2(t).$$

Lemma 3.1. There exists a universal constant $C > 0$ such that for any $t \le T$,

$$\mathbb{E}[W_1(t)] \le \sum_{i : X_i(0) > 0} C e^{\mu X_i(0)}.$$

Proof. We apply a result due to Maillard (second part of Lemma 5.4 in [IS]). Let $r \in (0, L)$
and assume that initially there is one particle at $x > 0$. Then the number of descendants $N_x(t)$
of that particle that do not hit either the left or right wall, and which lie in $[r + \mu t, L + \mu t]$,
satisfies

$$\mathbb{E}[N_x(t)] \le C e^{\mu(x - r)},$$

for some universal constant $C > 0$. Letting $r \to 0$ by the monotone convergence theorem and
summing over all initial positions $x > 0$ of particles gives the result. □

It remains to treat particles that do hit the left wall. Let $A_t$ be the number of particles
that are killed on the left wall if we kill all particles that hit this wall.

Lemma 3.2. Given $A_t$,

$$\mathbb{E}[W_2(t) \mid A_t] \le e^{\frac12\varepsilon t} A_t.$$

Proof. Let $\mathcal{F}_t = \sigma(A_s, s \le t)$. For each particle killed on the left wall at some time $s \le t$, the
conditional expectation, given $\mathcal{F}_t$, of the number of descendants at time $t$ that are greater or
equal to $\mu t$ is simply, by translation invariance, and the many-to-one lemma,

$$e^{t-s}\,\mathbb{P}\big(B_{t-s} \ge \mu(t - s)\big). \tag{30}$$

Now, let

$$f(t) = e^t\,\mathbb{P}(B_t \ge \mu t) = e^t \int_{\mu t}^{\infty} e^{-\frac{x^2}{2t}}\,\frac{dx}{\sqrt{2\pi t}}.$$

A quick calculation shows that since $\mu < \sqrt{2}$, $f'(t) \ge 0$ so (30) is maximised at $s = 0$. Thus,
summing over all the times $s$ at which some particle dies touching the left wall (thereby
increasing $A_s$ by one), yields

$$\mathbb{E}\Big[\sum_{i \in K(t)} 1_{\{\bar{X}_i(t) \ge \mu t\}} \,\Big|\, \mathcal{F}_t\Big] \le e^t\,\mathbb{P}(B_t \ge \mu t)\,A_t.$$

Since $A_t$ is $\mathcal{F}_t$-measurable, the result follows, after noticing that
$e^t\,\mathbb{P}(B_t \ge \mu t) \le e^t e^{-\frac12\mu^2 t} = e^{\frac12\varepsilon t}$ by well known bounds on the normal distribution tail.



□ 
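The final inequality of this proof, $e^t\,\mathbb{P}(B_t \ge \mu t) \le e^{\varepsilon t/2}$ with $\mu^2 = 2 - \varepsilon$, is easy to check numerically. The Python sketch below evaluates the Gaussian tail through the complementary error function; the function name and the sample value $\varepsilon = 0.1$ are illustrative assumptions.

```python
import math


def f_wall(t: float, mu: float) -> float:
    """e^t * P(B_t >= mu * t) for a standard Brownian motion B.

    Uses P(B_t >= mu t) = P(Z >= mu sqrt(t)) = erfc(mu sqrt(t/2)) / 2
    for a standard normal Z.
    """
    tail = 0.5 * math.erfc(mu * math.sqrt(t / 2.0))
    return math.exp(t) * tail


eps = 0.1                      # hypothetical value of the small parameter
mu = math.sqrt(2.0 - eps)
values = [(t, f_wall(t, mu), math.exp(0.5 * eps * t)) for t in (0.5, 1.0, 5.0, 20.0)]
```

Each triple in `values` lists a time, the exact quantity and the upper bound, so the slack in the tail estimate can be inspected directly.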
Hence we have reduced the problem to estimating from above $\mathbb{E}[A_t]$. To this end, we will
distinguish between those that started at positive positions and those at negative positions,
respectively $K_+(t)$ and $K_-(t)$. Call $A_+(t)$ and $A_-(t)$ the corresponding number of particles
killed at the left wall.

Lemma 3.3.

$$\mathbb{E}[A_+(t)] \le C e\,t \sum_{i : X_i(0) > 0} e^{\mu X_i(0)}.$$

Proof. Any particle that first hits the left wall from the right has to descend from an ancestor
$X_i(0)$ at time $0$ with $X_i(0) > 0$. We use the following very crude bound: during times $t$ and
$t + 1$, given $W_1(t)$,

$$\mathbb{E}[A_+(t + 1) - A_+(t) \mid W_1(t)] \le e\,W_1(t). \tag{31}$$

This is because the number of particles killed on the left wall during $[t, t + 1]$ cannot exceed
the total number of descendants of particles $i \in J(t)$ such that $X_i(t) > 0$. Since the number
of such particles is precisely $W_1(t)$, (31) follows.

Taking expectations in (31) we have by Lemma 3.1

$$\mathbb{E}[A_+(t + 1) - A_+(t)] \le e\,\mathbb{E}[W_1(t)] \le C e \sum_{i : X_i(0) > 0} e^{\mu X_i(0)}.$$

Since $A_s$ is non-decreasing, the lemma follows by summing over the intervals $[0, 1], \dots, [\lfloor t \rfloor, \lceil t \rceil]$.

□

We now address $A_-(t)$.

Lemma 3.4.

$$\mathbb{E}[A_-(t)] \le e^{\frac12\varepsilon t} \sum_{i : X_i(0) < 0} e^{\mu X_i(0)}.$$

Proof. Any particle that first hits the left wall from the left has to descend from an ancestor
$X_i(0)$ at time $0$ with $X_i(0) < 0$. The total number of such particles up to time $t$, $A_-(t)$, is
exactly the number of particles of a branching Brownian motion with drift $-\mu$ that hit level
$0$ by time $t$, started from the negative positions in the initial condition.

Fix some constant $A > 0$, and consider a branching Brownian motion with drift $-\mu$
where every particle is stopped upon reaching $0$ and killed upon reaching $-A$. Initially the
starting positions consist precisely of $(X_i(0), 1 \le i \le N)$ whenever $X_i(0) < 0$. We call $X^*_i(t)$,
$1 \le i \le N^*(t)$, the corresponding particle locations. Let $A^A(t)$ be the number of particles stopped
upon reaching $0$ by time $t$. Now consider the process

$$M^A_t = \sum_{i \le N^*(t)} (X^*_i(t) + A)\,e^{\mu(X^*_i(t) + A) - \frac12\varepsilon t}. \tag{32}$$

Without stopping particles upon reaching 0, it is easy to check that $(M^A_s, s \ge 0)$ defines a
nonnegative martingale (see e.g. Lemma 2 of [15] or Lemma 6 of [1]). However, if we stop
particles upon reaching 0, since $2 - \mu^2 = \varepsilon > 0$, $M^A$ becomes a supermartingale. Therefore

$$\sum_{i : X_i(0) < 0} A e^{\mu(X_i(0) + A)} \ge \mathbb{E}[M^A_0] \ge \mathbb{E}[M^A_t] \ge \mathbb{E}[A^A(t)]\,A e^{\mu A - \frac12\varepsilon t}.$$

So, making the cancellations,

$$\mathbb{E}[A^A(t)] \le e^{\frac12\varepsilon t} \sum_{i : X_i(0) < 0} e^{\mu X_i(0)}.$$

Letting $A \to \infty$ and using the monotone convergence theorem concludes the proof of Lemma
3.4. □
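The martingale property behind (32) rests on an eigenfunction identity: for $h(y) = (y + A)e^{\mu(y + A)}$, the generator of Brownian motion with drift $-\mu$, plus the rate-1 branching term, gives $\tfrac12 h'' - \mu h' + h = (1 - \mu^2/2)h = \tfrac{\varepsilon}{2} h$, which the factor $e^{-\varepsilon t/2}$ compensates. The Python sketch below checks this identity with closed-form derivatives; the function names and sample values are illustrative assumptions.

```python
import math


def h(y: float, a: float, mu: float) -> float:
    """h(y) = (y + a) * exp(mu * (y + a))."""
    u = y + a
    return u * math.exp(mu * u)


def generator_h(y: float, a: float, mu: float) -> float:
    """(1/2) h'' - mu h' + h, using the closed-form derivatives of h."""
    u = y + a
    e = math.exp(mu * u)
    h1 = e * (1.0 + mu * u)             # first derivative of h
    h2 = e * (2.0 * mu + mu * mu * u)   # second derivative of h
    return 0.5 * h2 - mu * h1 + u * e


eps = 0.1                      # hypothetical value of the small parameter
mu = math.sqrt(2.0 - eps)
lhs = generator_h(-1.0, 3.0, mu)
rhs = 0.5 * eps * h(-1.0, 3.0, mu)
```

The two numbers `lhs` and `rhs` agree up to rounding, which is the computation hidden behind "it is easy to check".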



With this supermartingale argument, we are also in a position to address $V_t$, the event that no
particle ever hits $L + \mu t$ up to time $t$.

Lemma 3.5.

$$\mathbb{P}(V_t^c) \le e^{\frac12\varepsilon t} \sum_{i=1}^{N} e^{\mu(X_i(0) - L)}.$$

Proof. We use the same supermartingale (32) as in the proof of Lemma 3.4, except we now
stop particles upon reaching $L$ in the branching Brownian motion with drift $-\mu$. Particles
are still killed at $-A$. Let $A^A_L(t)$ be the number of particles stopped upon reaching $L$ by time
$t$. Then arguing as before

$$\mathbb{E}[A^A_L(t)]\,(L + A)\,e^{\mu(L + A) - \frac12\varepsilon t} \le \sum_{i=1}^{N} (X_i(0) + A)\,e^{\mu(X_i(0) + A)}.$$

Since $L > X_i(0)$,

$$\mathbb{E}[A^A_L(t)] \le e^{\frac12\varepsilon t} \sum_{i=1}^{N} e^{\mu(X_i(0) - L)}.$$

We note that $\mathbb{P}(V_t^c) = \mathbb{P}(\lim_{A \to \infty} A^A_L(t) \ge 1)$ and conclude by Markov's inequality and
monotone convergence. □



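Putting together Lemmas 3.3 and 3.4, we get

$$\mathbb{E}[A_t] \le e^{\frac12\varepsilon t} \sum_{i : X_i(0) < 0} e^{\mu X_i(0)} + C e\,t \sum_{i : X_i(0) > 0} e^{\mu X_i(0)}.$$

Combining with Lemmas 3.1 and 3.2, this yields

$$\mathbb{E}[W_t] \le e^{\varepsilon t} \sum_{i : X_i(0) < 0} e^{\mu X_i(0)} + (1 + e t)\,C e^{\frac12\varepsilon t} \sum_{i : X_i(0) > 0} e^{\mu X_i(0)}.$$

Note that for $t \le T$,

$$e^{\varepsilon t} \le \exp\big(\varepsilon c_\delta (\log N)^3\big) = \exp\big(2\pi^2 c_\delta \log N\big) = N^{2\pi^2 c_\delta}.$$

(Recall the definition of $\varepsilon$ at the beginning of the section.) Therefore,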

N 



i=l 
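and because (8) holds,

$$\mathbb{P}(W_t \ge N) \le \frac{\mathbb{E}[W_t]}{N} \le N^{-\kappa},$$

where $\kappa = 1 - 2\pi^2 c_\delta - \delta > 0$ for a small enough choice of $c_\delta$. As for $\mathbb{P}(V_t^c)$, since

$$\mu L = \log N \sqrt{1 - \frac{\pi^2}{(\log N)^2}} \ge \Big(1 - \frac{\pi^2}{(\log N)^2}\Big)\log N$$

for all $N$ sufficiently large, by Lemma 3.5,

$$\mathbb{P}(V_t^c) \le N^{-\kappa}.$$

By (29), we now have a bound for any fixed time $t \le T$,

$$\mathbb{P}(X_N(t) \ge \mu t) \le \mathbb{P}(\bar{X}_N(t) \ge \mu t) \le 2N^{-\kappa}. \tag{33}$$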

We now extend this bound to hold for all $t \le T$. Let $t_k = k(\log N)^{-1}$, $k = 1, \dots, c_\delta(\log N)^4$,
so that the $t_k$ form a regular partition of $[0, T]$ with spacings of size $1/(\log N)$. During each
interval $[t_{k-1}, t_k]$ it is possible to check that $X_N(t)$ has small fluctuations. The key observation here
is that $X_N(t)$ is piecewise Brownian and only jumps to the right, never to the left. Therefore,
during the interval, the minimum cannot travel too far right of $\mu t_{k+1}$ due to the cost of motion
to the left.

We now fix some large constant $K > 0$ and define the bad events

$$B_k = \Big\{ \sup_{s \in [t_{k-1}, t_k]} X_N(s) \ge \mu t_k + K \Big\}, \tag{34}$$

and the good events $G_k = \{X_N(t_k) \le \mu t_k\}$.

Given the Brunet-Derrida system at time $t_{k-1}$, consider the coupled free branching Brownian
motion started from these $N$ particles and for a particle $X_i(t_k)$ at time $t_k$, let $Y_i(s)$ be
its ancestor at time $s \le t_k$. We see by the observation above that the probability of the event
$B_k \cap G_k$ is bounded by the probability of the analogous event for the branching Brownian
motion, namely

$$\Big\{ \exists\, 1 \le i \le N(t_k) : \sup_{s \in [t_{k-1}, t_k]} Y_i(s) \ge \mu t_k + K,\ X_i(t_k) \le \mu t_k \Big\}.$$

By a union bound and the many-to-one lemma (Lemma 2.1),

$$\begin{aligned}
\mathbb{P}(B_k \cap G_k) &\le N e^{t_k - t_{k-1}}\,\mathbb{P}\Big( \sup_{s \in [t_{k-1}, t_k]} Z(s) - Z(t_k) \ge K \Big) \\
&\le 2N e^{(\log N)^{-1}}\,\mathbb{P}\big(Z((\log N)^{-1}) \ge K\big) \\
&\le 4N e^{-\frac12 K^2 \log N},
\end{aligned}$$

where $(Z(u), u \ge 0)$ is a Brownian motion. So for $K > \sqrt{2(1 + \kappa)}$,

$$\mathbb{P}(B_k \cap G_k) \le 4N^{-\kappa}. \tag{35}$$
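The role of the threshold on $K$ is a one-line computation: $4N e^{-K^2 \log N / 2} = 4N^{1 - K^2/2}$, which is at most $4N^{-\kappa}$ as soon as $K^2/2 \ge 1 + \kappa$. A small Python sketch of this check follows; the function names and sample values are illustrative assumptions.

```python
import math


def bad_event_bound(n: int, big_k: float) -> float:
    """Upper bound 4 N exp(-K^2 log(N) / 2) on P(B_k and G_k)."""
    return 4.0 * n * math.exp(-0.5 * big_k * big_k * math.log(n))


def k_threshold(kappa: float) -> float:
    """Smallest K for which the bound is at most 4 N^(-kappa)."""
    return math.sqrt(2.0 * (1.0 + kappa))


n, kappa = 1000, 0.3            # hypothetical sample values
big_k = k_threshold(kappa) + 0.01
```

Any $K$ above the threshold works uniformly in $N > 1$, which is why $K$ can be fixed as a constant before letting $N \to \infty$.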
Figure 5: Diagram reference for proof of Theorem 1.11



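We can sum the conclusions of (33) and (35) over all $k$ to show that

$$\mathbb{P}\Big( \sup_{t \le T} \{X_N(t) - \mu t\} \ge K + \mu(\log N)^{-1} \Big) \le \sum_{k=1}^{c_\delta(\log N)^4} \big(\mathbb{P}(B_k \cap G_k) + \mathbb{P}(G_k^c)\big) \le 6 c_\delta (\log N)^4 N^{-\kappa} \to 0.$$

□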



4 Proof of Theorem 1.11 



Consider a Brunet-Derrida particle system with $N$ particles, started from an initial
configuration satisfying (8), driven by the linear score function $s(x) = \langle x, \lambda \rangle = x$. As before, let
$T = c_\delta(\log N)^3$ where $c_\delta > 0$ is a small constant depending only on $\delta$ which we will fix
later on. Let $\xi > 0$ be small enough that $\delta' = \delta + \sqrt{2}\xi < 1$. Note that (8) implies that if
$Y_n(t) = X_n(t) - x + \xi \log N$, then

A 7 " 



n=l 



Thus by Proposition 1.12 with probability tending to 1, for all $t \le T$,

$$X_N(t) \le x - \xi \log N + \sqrt{2}\,t. \tag{36}$$

On this event,

$$X_N(t) < w(t) = x - \frac{\xi}{2}\log N + \mu' t, \tag{37}$$

where

$$\mu' = \sqrt{2} - \frac{\xi}{2c_\delta(\log N)^2}. \tag{38}$$

The function $w(t)$ is a linear boundary which will act as a killing wall. Note that

$$w(0) = x - \frac{\xi}{2}\log N, \qquad w(T) = x - \xi \log N + \sqrt{2}\,T,$$

and thus by (37), if a particle never hits $w$ and starts to its right (i.e., $X_i(0) \ge w(0)$),
then it will survive selection in the Brunet-Derrida system.
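The endpoint values of the killing wall can be verified directly from (37) and (38): since $T = c_\delta(\log N)^3$, the correction $\xi T / (2c_\delta(\log N)^2)$ equals $\tfrac{\xi}{2}\log N$. The Python sketch below performs this check, and also confirms that $w(t)$ lies above the line $x - \xi\log N + \sqrt{2}\,t$ on $[0, T]$; the function name and sample parameters are illustrative assumptions.

```python
import math


def killing_wall(t: float, x: float, xi: float, c_delta: float, n: int) -> float:
    """w(t) = x - (xi/2) log N + mu' t with mu' = sqrt(2) - xi / (2 c (log N)^2)."""
    log_n = math.log(n)
    mu_prime = math.sqrt(2.0) - xi / (2.0 * c_delta * log_n ** 2)
    return x - 0.5 * xi * log_n + mu_prime * t


x, xi, c_delta, n = 0.0, 0.2, 0.01, 10 ** 6   # hypothetical sample values
log_n = math.log(n)
t_max = c_delta * log_n ** 3                   # T = c (log N)^3
w_end = killing_wall(t_max, x, xi, c_delta, n)
```

The gap between the wall and the line shrinks linearly from $\tfrac{\xi}{2}\log N$ at time $0$ to $0$ at time $T$.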

Now, let $Q_{\mu'}(y)$ be the probability that a branching Brownian motion starting from one
particle at $y > 0$ survives killing at a wall $\mu' t$ for all time. Then the probability of a particle
at position $x$ or greater to survive in the Brunet-Derrida system until time $T$ is greater than
the probability it survives killing at the wall $w(t)$ until time $T$, which is in turn bounded
below by $Q_{\mu'}((\xi/2)\log N)$.

By Theorem 1 in [5], we deduce that for $c_\delta < \xi^3/(2\sqrt{2}\pi^2)$, $Q_{\mu'}((\xi/2)\log N) \to 1$ as
$N \to \infty$, and hence any particle at $x$ has descendants alive at time $T$, as desired. □



5 Proof of Theorem 1.7



Let $H = \{x \in \mathbb{R}^d : \langle x, \lambda \rangle = 0\}$ be the orthogonal hyperplane to $\lambda$. Recall that the
Brunet-Derrida particle system is driven by the linear score $s(x) = \langle x, \lambda \rangle$. We have already shown
in Lemma 2.5 that for any initial condition, if $t > (1 + \kappa)\log N$ for some $\kappa > 0$ and $a > 3\sqrt{2}$,

$$\lim_{N \to \infty} \mathbb{P}\big(\mathrm{diam}_\lambda \le a \log N\big) = 1.$$

On the other hand, by Theorem 1.11, if (8) initially holds with $x = X_i(0)$ for some $i \ge 2$,
both the rightmost and second rightmost particles have descendants alive at time $T$ with high
probability, where $T = c_\delta(\log N)^3$, for some $c_\delta > 0$ possibly depending on $\delta$ in (8).

If (8) initially holds only for $x = X_1(0)$, we observe that the maximum particle fails to
branch by time $u = \log\log N$ with probability $\le 1/(\log N)$. Moreover, by Lemma
2.2, on the event that the maximum branches before time $u$, there will be with high probability
at least two particles at time $u$ with position at least $x - 2\log\log N$. But by introducing an extra
$2\log\log N$ term in (36) in the proof of Theorem 1.11, we see we can apply the conclusion of
the theorem to both these particles.

Therefore, at time $u$, two separate particles $X_i(u), X_j(u)$ both have descendants alive at
time $T$ with high probability. Let $E$ be this event and call $X_i(T), X_j(T)$ the positions of two
arbitrarily chosen descendants of both particles. Let $Y_i(t), Y_j(t)$ denote the positions at time
$u \le t \le T$ of the ancestors of $X_i(T)$ and $X_j(T)$. Hence $Y_i(u) = X_i(u)$ and $Y_j(u) = X_j(u)$.

Then note that if $p_H$ is the orthogonal projection onto $H$, $p_H(Y_i)$ and $p_H(Y_j)$ are
independent $(d - 1)$-dimensional Brownian motions on $H$ on the time interval $[u, T]$ (see the end
of the proof of Theorem 1.4). Thus

$$\mathrm{diam}^{\perp} \ge \|p_H(Y_i(T)) - p_H(Y_j(T))\|,$$

on the event $E$, and in particular,

$$\liminf_{\eta \to 0}\ \liminf_{N \to \infty} \mathbb{P}\big(\mathrm{diam}^{\perp} \ge \eta(\log N)^{3/2}\big) = 1,$$

as desired. □